Abstract

The study of genetic map linearization leads to a combinatorially hard problem, called the
minimum breakpoint linearization (MBL) problem. It aims to find a linearization of a
partial order that attains the minimum breakpoint distance to a reference total order. The
approximation algorithms previously developed for the MBL problem are only applicable to
genetic maps in which genes or markers are represented as signed integers. However, current
genetic mapping techniques generally do not specify gene strandedness, so genes can only
be represented as unsigned integers. In this paper, we study the MBL problem in the latter,
more realistic case. We develop an approximation algorithm that achieves a ratio
of $(m^2 + 2m - 1)$ and runs in $O(n^7)$ time, where $m$ is the number of genetic maps used to
construct the input partial order and $n$ the total number of distinct genes in these maps.

Index terms — Comparative genomics, partial order, breakpoint distance, feedback vertex set. 


1 Introduction 

Genetic map linearization is a crucial preliminary step for most comparative genomics studies,
because they generally require a total order of genes or markers on a chromosome rather than
the partial order that current genetic mapping techniques may only suffice to provide [?, ?, 8, ?].
One of the computational approaches proposed for genetic map linearization is to find a topological
sort of the directed acyclic graph (DAG) that represents the input genetic maps while minimizing
its breakpoint distance to a reference total order. This leads to a combinatorial optimization
problem, called the minimum breakpoint linearization (MBL) problem [?], which has attracted
considerable research attention in the past few years [?, ?, 3, 6].

The MBL problem has been shown to be NP-hard [?], and even APX-hard [?]. The first algorithm
proposed to solve the MBL problem is an exact dynamic programming algorithm running in
exponential time in the worst case [?]. In the same paper, a time-efficient heuristic algorithm is also
presented, which, however, has no performance guarantee. The first attempt to develop a
polynomial-time approximation algorithm was made in [?]. Unfortunately, the proposed algorithm was
later found invalid [?] because it relies on a flawed statement in [?] on adjacency-order graphs.
To fix this flaw, the authors of [?] revised the construction of adjacency-order graphs and proposed
three approximation algorithms, two of which are based on existing approximation algorithms
for a general variant of the feedback vertex set problem, while the third was developed in the
same spirit as in [?], achieving a ratio of $(m^2 + 4m - 4)$ (only for $m \geq 2$).

As we shall show in Section 2.3, the above approximation algorithms are only applicable to
input genetic maps in which genes or markers are represented as signed integers, where the signs
represent the strands of genes/markers. However, we note that the original definition of the MBL
problem assumes unsigned integers for genes [?]. In fact, this is the more realistic case. Current
genetic mapping techniques such as recombination analysis and physical imaging generally do not
specify gene strandedness, so genes can only be represented as unsigned integers [?]. Based
on this observation, whether the MBL problem can be approximated remains an open question.

In this paper, we study the MBL problem in the more realistic case where no gene strandedness
information is available for the input genetic maps. We revise the definition of conflict-cycle in
[3], from which an approximation algorithm is developed in the same spirit as in
[?, ?]. It achieves a ratio of $(m^2 + 2m - 1)$ (which holds for all $m \geq 1$) and runs in $O(n^7)$ time,
where $m$ is the number of genetic maps used to construct the input partial order and $n$ the total
number of distinct genes occurring in these maps.

The rest of the paper is organized as follows. We first introduce some preliminaries and
notations in Section 2. In Section 3, we discuss a number of basic facts about the MBL problem, which
lead to the formulation of the minimum breakpoint vertex set (MBVS) problem in Section 4. We
present an approximation algorithm for the MBL problem via the approximation of the MBVS
problem in Section 4, and then analyze both its approximation ratio and
running time in Section 5. Finally, some concluding remarks are made in Section 6. For the sake
of consistency, we borrow many notations from [?] and [?] throughout the paper.


2 Preliminaries and notations 

2.1 Genetic maps and their combined directed acyclic graph 

A genetic map is a totally-ordered sequence of blocks, each of which comprises one or more genes. 
It defines a partial order on genes, where genes within a block are ordered before all those in its 
succeeding blocks, but unordered among themselves. 

Today it is increasingly common to find multiple genetic maps available for the same genome.
Combining these maps often provides a partial order with a higher coverage of gene ordering than
an individual genetic map. To represent this partial order, we may construct a directed acyclic
graph $\Pi = (\Sigma, D)$, where the vertex set $\Sigma = \{1, \ldots, n\}$ is made of all the contributing genes and
the arc set $D$ is made of all the ordered pairs of genes appearing in consecutive blocks of the same
genetic map [7, ?]. Two properties can be deduced [?] from these genetic maps and their combined
directed acyclic graph: (i) if there is an arc between two genes $i$ and $j$ in $\Pi$, then $i$ and $j$ appear in
consecutive blocks of some genetic map, and (ii) if $i$ and $j$ appear in different blocks of the same
genetic map, then there exists in $\Pi$ a nonempty directed path either from $i$ to $j$ or from $j$ to $i$. See
Figure 1 for a simple example of $\Pi$ constructed from two genetic maps.

Figure 1: The construction of an adjacency-order graph as proposed in [3]. The symmetric arcs in
$F$ are represented as double arrows.

We say gene $i$ is ordered before (resp. after) gene $j$ by $\Pi$ if there exists in $\Pi$ a nonempty
directed path from $i$ to $j$ (resp. $j$ to $i$). We use $i \prec_\Pi j$ to denote the ordering relation that gene
$i$ is ordered before gene $j$ by $\Pi$. Unlike in [5], we assume in this paper that combining multiple
genetic maps never creates order conflicts, i.e., we cannot have both $i \prec_\Pi j$ and $j \prec_\Pi i$
simultaneously.
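The construction of the combined graph can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_dag` and `ordered_before`, the representation of a genetic map as a list of sets, and the two example maps are assumptions made here and do not come from the paper.

```python
from itertools import product


def build_dag(genetic_maps):
    """Return (genes, arcs) of the combined graph for a list of genetic maps.

    Each map is a list of blocks; each block is a set of gene labels.
    """
    genes, arcs = set(), set()
    for blocks in genetic_maps:
        for block in blocks:
            genes.update(block)
        # Join every gene of a block to every gene of the next block.
        for left, right in zip(blocks, blocks[1:]):
            arcs.update(product(left, right))
    return genes, arcs


def ordered_before(arcs, i, j):
    """True if a nonempty directed path leads from gene i to gene j."""
    successors = {}
    for u, v in arcs:
        successors.setdefault(u, set()).add(v)
    stack, seen = list(successors.get(i, ())), set()
    while stack:
        u = stack.pop()
        if u == j:
            return True
        if u not in seen:
            seen.add(u)
            stack.extend(successors.get(u, ()))
    return False


# Hypothetical input: two genetic maps over the genes 1..5.
example_maps = [[{1}, {2, 3}, {5}], [{1}, {4}, {5}]]
example_genes, example_arcs = build_dag(example_maps)
```

In this example, genes 2 and 3 share a block and therefore remain unordered, while gene 1 is ordered before gene 5 through either map.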

2.2 The minimum breakpoint linearization problem 

Let $\Pi = (\Sigma, D)$ be a directed acyclic graph representing a partial order generated with $m$ genetic
maps of the same genome. A linearization of $\Pi$ is a total order of genes $\pi = \pi(1) \cdot \pi(2) \cdots \pi(n)$,
i.e., a permutation on $\{1, 2, \ldots, n\}$, such that, for all genes $i, j$, if $i \prec_\Pi j$ then $i \prec_\pi j$. In this case,
$\pi$ is said to be compatible with $\Pi$. Let $\Gamma$ denote another genome with the same set of genes in a
total order. Without loss of generality, we assume that $\Gamma$ is the identity permutation $1\,2 \cdots n$. A
pair of genes that are adjacent in $\pi$ but not in $\Gamma$ is called a breakpoint of $\pi$ with respect to $\Gamma$, and
the total number of breakpoints is defined as the breakpoint distance between $\pi$ and $\Gamma$ [1].

Given a partial order $\Pi$ and a total order $\Gamma$ as described above, the minimum breakpoint
linearization (MBL) problem is to find a linearization $\pi$ of $\Pi$ such that the breakpoint
distance between $\pi$ and $\Gamma$ is minimized [2]. This minimum breakpoint distance is further referred
to as the breakpoint distance between $\Pi$ and $\Gamma$, and denoted by $d_b(\Pi, \Gamma)$.
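To make the definitions concrete, the following Python sketch computes the breakpoint distance of a permutation to the identity and solves tiny MBL instances by exhaustive search. It is not the algorithm of this paper: the function names are hypothetical, the search is exponential, and genes are treated as unsigned, so $i\,(i+1)$ and $(i+1)\,i$ both count as the adjacency $i \cdot (i+1)$.

```python
from itertools import permutations


def breakpoint_distance(pi):
    """Count adjacent pairs of pi that are not adjacent in the identity."""
    return sum(1 for x, y in zip(pi, pi[1:]) if abs(x - y) != 1)


def is_linearization(pi, arcs):
    """True if pi places u before v for every arc (u, v) of the DAG."""
    position = {gene: k for k, gene in enumerate(pi)}
    return all(position[u] < position[v] for u, v in arcs)


def brute_force_mbl(n, arcs):
    """Try all permutations of 1..n; return (distance, permutation)."""
    best = None
    for pi in permutations(range(1, n + 1)):
        if is_linearization(pi, arcs):
            d = breakpoint_distance(pi)
            if best is None or d < best[0]:
                best = (d, pi)
    return best


# The order 1 2 5 4 3 has a single breakpoint, between 2 and 5.
print(breakpoint_distance((1, 2, 5, 4, 3)))  # prints 1
```

Checking only the arcs in `is_linearization` suffices because the ordering relation of the partial order is the transitive closure of the arcs.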
2.3 Adjacency-Order Graph

In this study we adopt the definition of adjacency-order graph introduced in [3]. To construct
an adjacency-order graph for a partial order $\Pi = (\Sigma, D)$, we first create a set $W$ of vertices
representing the adjacencies of the identity permutation $\Gamma$ by $W = \{i \cdot (i+1) \mid 1 \leq i < n\}$, and
let $V = \Sigma \cup W$ (see Figure 1). We will not distinguish the vertices of $\Sigma$ and their corresponding
integers, which will always be clear from the context. Then, we construct a set of arcs $F$ as

$F = \{i \cdot (i+1) \to i,\ i \cdot (i+1) \to i+1,\ i \to i \cdot (i+1),\ i+1 \to i \cdot (i+1) \mid 1 \leq i < n\}$,

where the arrow $\to$ is used to denote an arc. Note that every arc in $F$ has one end in $W$ and the
other end in $\Sigma$. Let $E = D \cup F$ (see Figure 1). Finally, we define the adjacency-order graph $G_\Pi$
of $\Pi$ by $G_\Pi = (V, E)$.

Note that in $G_\Pi$, the arcs of $D$ may go either top-down or bottom-up. Let $X[G_\Pi]$ (or only $X$,
if there is no ambiguity) be the set of arcs in $D$ that go top-down, and $Y[G_\Pi]$ (or only $Y$) the set
of arcs in $D$ that go bottom-up. Formally, we may write $X[G_\Pi] = \{i \to j \in D \mid i > j\}$ and
$Y[G_\Pi] = \{i \to j \in D \mid i < j\}$. It is easy to see that $D = X \cup Y$ and $X \cap Y = \emptyset$.
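As an illustration, the sets $W$, $F$, $X$ and $Y$ can be computed as follows. This is a minimal sketch under assumptions made here: genes are the integers $1, \ldots, n$, the adjacency vertex $i \cdot (i+1)$ is encoded as the Python tuple `(i, i + 1)`, and the function name `adjacency_order_graph` and the example arc set are hypothetical.

```python
def adjacency_order_graph(n, D):
    """Return (V, W, F, X, Y) for genes 1..n and an arc set D of pairs.

    The adjacency i.(i+1) is encoded as the tuple (i, i + 1); this
    encoding is an illustrative choice, not notation from the paper.
    """
    W = {(i, i + 1) for i in range(1, n)}
    F = set()
    for i in range(1, n):
        w = (i, i + 1)
        # Symmetric arcs between an adjacency and its two genes.
        F.update({(w, i), (w, i + 1), (i, w), (i + 1, w)})
    X = {(i, j) for i, j in D if i > j}  # arcs going top-down
    Y = {(i, j) for i, j in D if i < j}  # arcs going bottom-up
    V = set(range(1, n + 1)) | W
    return V, W, F, X, Y


# Hypothetical arc set on five genes.
V, W, F, X, Y = adjacency_order_graph(5, {(1, 2), (2, 5), (5, 3), (2, 4)})
```

Each of the $n - 1$ adjacency vertices contributes four arcs to $F$, so $|F| = 4(n - 1)$.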

In [3], a conflict-cycle refers to a cycle that uses an arc from $X$. By this definition, a
conflict-cycle does not necessarily use any arc from $Y$, and all its adjacencies might still co-exist in some
linearization of $\Pi$, as we can see from the adjacency-order graph $G_\Pi$ shown in Figure 1. This
adjacency-order graph contains a conflict-cycle $3 \to 3 \cdot 4 \to 4 \to 4 \cdot 5 \to 5 \to 3$ (as defined in
[3]), for which both adjacencies $3 \cdot 4$ and $4 \cdot 5$ may occur in the linearization $1\ 2\ 5\ 4\ 3$ of $\Pi$. Based
on these observations, in this study we use a different definition of conflict-cycles as follows.

Definition 2.1 A cycle in $G_\Pi$ is called a conflict-cycle if it contains at least one arc from $X$ and at
least one arc from $Y$.

This new definition has wide implications for the approximation of the MBL problem, as
we shall see later. A quick look indicates that the example cycle mentioned above is no longer
a conflict-cycle. In Theorem 3.10, we shall prove that the adjacencies involved in a
conflict-cycle cannot co-exist in any linearization of $\Pi$. Consequently, we need to remove at least one
adjacency from each of those cycles in order to obtain a linearization of $\Pi$.
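Since cycles here are not required to be simple, a conflict-cycle in the sense of Definition 2.1 exists exactly when some strongly connected component contains both an arc of $X$ and an arc of $Y$: any two arcs inside one component can be joined into a closed walk. The Python sketch below uses this observation. It is offered as an illustration only; the function names, the tuple encoding `(i, i + 1)` of adjacency vertices, and the choice of Kosaraju's algorithm are assumptions made here.

```python
def component_ids(vertices, arcs):
    """Map each vertex to a component label (Kosaraju's two passes)."""
    succ = {v: [] for v in vertices}
    pred = {v: [] for v in vertices}
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)
    order, seen = [], set()
    for root in vertices:  # pass 1: record DFS completion order
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))
    comp = {}
    for root in reversed(order):  # pass 2: search the reversed graph
        if root in comp:
            continue
        comp[root] = root
        stack = [root]
        while stack:
            node = stack.pop()
            for w in pred[node]:
                if w not in comp:
                    comp[w] = root
                    stack.append(w)
    return comp


def has_conflict_cycle(n, D, kept):
    """True if genes 1..n plus the adjacencies i.(i+1), i in kept,
    induce a cycle that uses arcs from both X and Y."""
    vertices = list(range(1, n + 1)) + [(i, i + 1) for i in sorted(kept)]
    arcs = list(D)
    for i in kept:
        w = (i, i + 1)
        arcs += [(w, i), (w, i + 1), (i, w), (i + 1, w)]
    comp = component_ids(vertices, arcs)
    with_x = {comp[u] for u, v in D if u > v and comp[u] == comp[v]}
    with_y = {comp[u] for u, v in D if u < v and comp[u] == comp[v]}
    return bool(with_x & with_y)


# The cycle 3 -> 3.4 -> 4 -> 4.5 -> 5 -> 3 uses only an arc of X.
print(has_conflict_cycle(5, {(5, 3)}, {3, 4}))  # prints False
```

With the arcs $2 \to 5$ and $5 \to 3$ and the adjacency $2 \cdot 3$ kept, the same check reports a conflict-cycle, namely $2 \to 5 \to 3 \to 2 \cdot 3 \to 2$.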

Most of the following notations are already introduced in [?]. An arc between $u$ and $v$ is
written $u \to v$, or $u \to_A v$ if it belongs to some set $A$. A path $P$ is a (possibly empty) sequence
of arcs written $u \xrightarrow{P}^{*} v$, or $u \xrightarrow{P}^{*}_{A} v$ if $P$ uses arcs only from $A$. A nonempty path is written
as $u \xrightarrow{P}^{+} v$ with a $+$ sign. A cycle is a nonempty path $u \xrightarrow{P}^{+} v$ with $v = u$. Given a path
$P = v_0 \to v_1 \to \cdots \to v_l$ in $G_\Pi$, the following notations are used: $l(P) = l$ is the length of $P$,
$V(P) = \{v_h \mid 0 \leq h \leq l\}$, $W(P) = V(P) \cap W$, $\Sigma(P) = V(P) \cap \Sigma$, $E(P) = \{v_h \to v_{h+1} \mid 0 \leq h < l\}$,
$F(P) = E(P) \cap F$, $D(P) = E(P) \cap D$, $X(P) = E(P) \cap X$, and $Y(P) = E(P) \cap Y$.
A cycle $C$ is said to be simple if all vertices $v_h$ are distinct except $v_0 = v_l$, which implies that
$l(C) = |V(C)| = |E(C)|$. If a cycle $C$ is not simple, then it contains a subcycle $C'$ such that
$V(C') \subseteq V(C)$ and $E(C') \subseteq E(C)$. In this paper, we further require $C' \neq C$ when $C'$ is a
subcycle of $C$.
3 Some basic facts


Given a cycle $C$ in $G_\Pi$, we may partition $W(C)$ into a collection of disjoint subsets $W_h(C)$ such
that each of them can be written as $\{i \cdot (i+1) \mid a_h \leq i < b_h\}$, for some integers $a_h$ and $b_h$. We denote
such a collection of disjoint subsets with minimum cardinality by $\mathcal{W}(C) = \{W_1(C), W_2(C), \ldots, W_l(C)\}$.
Note that, for every cycle $C$ in $G_\Pi$, we have $l = |\mathcal{W}(C)| \geq 1$ because $\Pi = (\Sigma, D)$ is a directed
acyclic graph.
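The minimum-cardinality partition simply groups the adjacencies of a cycle into maximal runs of consecutive indices. A small Python sketch follows, assuming that the adjacency $i \cdot (i+1)$ is identified by its index $i$; the function name `run_partition` is hypothetical.

```python
def run_partition(adjacency_indices):
    """Split indices i (standing for i.(i+1)) into maximal runs.

    Returns a list of pairs (a, b), each meaning {i.(i+1) | a <= i < b}.
    """
    runs = []
    for i in sorted(adjacency_indices):
        if runs and runs[-1][1] == i:
            runs[-1] = (runs[-1][0], i + 1)  # extend the current run
        else:
            runs.append((i, i + 1))  # start a new run
    return runs


# Adjacencies 1.2, 2.3 and 4.5 form two runs.
print(run_partition({1, 2, 4}))  # prints [(1, 3), (4, 5)]
```

The closed intervals $[1, 3]$ and $[4, 5]$ obtained in the example are disjoint, which is the property stated in Lemma 3.1.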

Lemma 3.1 Let $C$ be a (not necessarily simple) cycle with $W_1(C) = \{i \cdot (i+1) \mid a_1 \leq i < b_1\}$
and $W_2(C) = \{i \cdot (i+1) \mid a_2 \leq i < b_2\}$ being two distinct elements of $\mathcal{W}(C)$. Then, we have
$[a_1, b_1] \cap [a_2, b_2] = \emptyset$.

Proof. By contradiction, suppose that $[a_1, b_1] \cap [a_2, b_2] \neq \emptyset$, which implies that $a_1 \leq b_2$ and
$a_2 \leq b_1$. Without loss of generality, assume that $a_1 \leq a_2$. Let $a = \min(a_1, a_2)$ and $b = \max(b_1, b_2)$,
and let $W_1'(C) = \{i \cdot (i+1) \mid a \leq i < b\}$.
For all $i \in [a_1, b_1] \cup [a_2, b_2]$, we have $i \in [a, b]$, which implies that $W_1(C) \cup W_2(C) \subseteq W_1'(C)$. Next
we show that, for all $i \in [a, b)$, we have either $i \in [a_1, b_1)$ or $i \in [a_2, b_2)$. If $i \notin [a_1, b_1)$, then
$i \geq b_1$ since $i \geq a_1$ and, further, $i \geq a_2$ since $a_2 \leq b_1$. On the other hand, we have $i < b_2$
because $i < b = \max(b_1, b_2)$. It hence follows that $i \in [a_2, b_2)$ if $i \notin [a_1, b_1)$. In
either case, i.e., either $i \in [a_1, b_1)$ or $i \in [a_2, b_2)$, we have $W_1'(C) \subseteq W_1(C) \cup W_2(C)$. Thus,
$W_1(C) \cup W_2(C) = W_1'(C)$. Consequently, we can obtain a smaller partition of $W(C)$ by
replacing the two sets $W_1(C)$ and $W_2(C)$ of the current partition $\mathcal{W}(C)$ with the single set $W_1'(C)$, which
contradicts the fact that $\mathcal{W}(C)$ attains the minimum cardinality. ■

Lemma 3.2 Let $C$ be a (not necessarily simple) cycle with $W_1(C) = \{i \cdot (i+1) \mid a \leq i < b\}$ being
an element of $\mathcal{W}(C)$. If there exists a vertex $c \in \Sigma(C)$ such that $c \notin [a, b]$, then $C$ is a conflict-cycle.

Proof. We first assume that $c < a$. Define $a^+ = \{i \mid i \geq a\} \cup \{i \cdot (i+1) \mid i \geq a\}$ and
$a^- = \{i \mid i < a\} \cup \{i \cdot (i+1) \mid i < a\}$. Then, $\{a^+, a^-\}$ is a partition of $V$. Note that there exists
in $F$ exactly one arc from $a^+$ to $a^-$ and exactly one arc from $a^-$ to $a^+$, i.e., $a \to_F (a-1) \cdot a$
and $(a-1) \cdot a \to_F a$, respectively. Suppose that $C$ does not contain any arc from $X$. Since $C$
contains vertices in both $a^+$ and $a^-$ (resp. $b$ and $c$), it contains an arc $u \to v$ with $u \in a^+$ and
$v \in a^-$. We must have $u \to v \in F$; otherwise, $u \to v \in D$ implies that $u \to v \in X$ since $u > v$.
Consequently, we can only have $u = a$ and $v = (a-1) \cdot a$ by the definitions of $a^+$ and $a^-$. So, $C$
uses the vertex $(a-1) \cdot a$. However, $W_1(C) = \{i \cdot (i+1) \mid a \leq i < b\}$ is an element of $\mathcal{W}(C)$,
which, by definition, implies that $C$ does not use the vertex $(a-1) \cdot a$; a contradiction. Therefore,
$C$ must contain an arc from $X$. Now we suppose that $C$ does not contain any arc from $Y$. Once
again, since $C$ contains vertices in both $a^+$ and $a^-$, it contains an arc $u \to v$ with $u \in a^-$ and
$v \in a^+$. We must have $u \to v \in F$; otherwise, $u \to v \in D$ implies that $u \to v \in Y$ since $u < v$.
Consequently, we can only have $u = (a-1) \cdot a$ and $v = a$. So, $C$ also necessarily uses the vertex
$(a-1) \cdot a$. As shown above, this leads to a contradiction. Therefore, $C$ must contain an arc
from $Y$ too. It follows that $C$ is a conflict-cycle.

In the case of $c > b$, we may define $b^+ = \{i \mid i > b\} \cup \{i \cdot (i+1) \mid i \geq b\}$ and
$b^- = \{i \mid i \leq b\} \cup \{i \cdot (i+1) \mid i < b\}$. Then, by using the same arguments as above, we can also show that $C$ is a
conflict-cycle. ■
Lemma 3.3 Let $\pi$ be a total order that contains every adjacency in the set $\{i \cdot (i+1) \mid a \leq i < b\}$.
Then, either the sequence $a\,(a+1)\,(a+2) \cdots b$ or $b\,(b-1)\,(b-2) \cdots a$ is an interval of $\pi$.

Proof. Recall that an adjacency $i \cdot (i+1)$ implies the occurrence of an interval either $i\,(i+1)$ or
$(i+1)\,i$, but not both, in $\pi$. We first consider the adjacency $a \cdot (a+1)$, for which either the interval
$a\,(a+1)$ or $(a+1)\,a$ occurs in $\pi$. We distinguish these two cases when the next adjacency
$(a+1) \cdot (a+2)$ is considered. In the first case of the interval $a\,(a+1)$, in order to obtain the
adjacency $(a+1) \cdot (a+2)$ in $\pi$, the element $(a+2)$ can only appear immediately after the element
$(a+1)$, resulting in the interval $a\,(a+1)\,(a+2)$. In the second case of the interval $(a+1)\,a$, in
order to obtain the adjacency $(a+1) \cdot (a+2)$ in $\pi$, the element $(a+2)$ can only appear immediately
before the element $(a+1)$, resulting in the interval $(a+2)\,(a+1)\,a$. Continuing this process with
the remaining adjacencies in increasing order of elements, we necessarily end up with an
interval either $a\,(a+1)\,(a+2) \cdots b$ or $b\,(b-1)\,(b-2) \cdots a$ in $\pi$. ■

Lemma 3.4 Let $\pi$ be a total order that contains every adjacency in the set $\{i \cdot (i+1) \mid a \leq i < b\}$.
Assume that there exists in $G_\Pi$ an arc $i_1 \to i_2 \in D$, where $a \leq i_1 \leq b$ and $a \leq i_2 \leq b$. If $i_1 < i_2$
(resp., $i_1 > i_2$), then the sequence $a\,(a+1)\,(a+2) \cdots b$ (resp., $b\,(b-1)\,(b-2) \cdots a$) is an
interval of $\pi$.

Proof. The proof is given only for the case of $i_1 < i_2$. We know from Lemma 3.3 that $\pi$
contains either the interval $a\,(a+1)\,(a+2) \cdots i_1 \cdots i_2 \cdots b$ or $b\,(b-1)\,(b-2) \cdots i_2 \cdots i_1 \cdots a$.
On the other hand, we have $i_1 \prec_\Pi i_2$, since there exists an arc $i_1 \to i_2 \in D$. Consequently, the
interval $b\,(b-1)\,(b-2) \cdots i_2 \cdots i_1 \cdots a$ cannot appear in $\pi$. ■

We wish to distinguish two types of conflict-cycles. A conflict-cycle $C$ is said to be of type I if
there exist two vertices $a$ and $b$ in $\Sigma(C)$ such that $V(C) = \{i \cdot (i+1) \mid a \leq i < b\} \cup \{i \mid a \leq i \leq b\}$;
otherwise, it is said to be of type II. For example, in the adjacency-order graph shown in Figure 1,
the cycle $1 \to 2 \to 2 \cdot 3 \to 3 \to 3 \cdot 4 \to 4 \to 4 \cdot 5 \to 5 \to 3 \to 2 \cdot 3 \to 2 \to 1 \cdot 2 \to 1$ is a
conflict-cycle of type I, while both $2 \to 5 \to 3 \to 2 \cdot 3 \to 2$ and $2 \to 4 \to 4 \cdot 5 \to 5 \to 3 \to 2 \cdot 3 \to 2$
are conflict-cycles of type II. Lemmas 3.5 and 3.6 below follow from the above definitions in a
straightforward way.
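The type of a conflict-cycle depends only on its vertex set, so it can be checked mechanically. The Python sketch below classifies the three example cycles above. It assumes a cycle is given as a vertex sequence in which genes are integers and the adjacency $i \cdot (i+1)$ is the tuple `(i, i + 1)`; the function name `cycle_type` is hypothetical, and the function does not verify that its input is a conflict-cycle.

```python
def cycle_type(cycle):
    """Return "I" or "II" for a conflict-cycle given as a vertex list.

    Genes are ints; the adjacency i.(i+1) is the tuple (i, i + 1).
    """
    genes = {v for v in cycle if isinstance(v, int)}
    adjacencies = {v[0] for v in cycle if isinstance(v, tuple)}
    # If a and b exist, they must be the smallest and largest gene.
    a, b = min(genes), max(genes)
    is_type_one = (genes == set(range(a, b + 1))
                   and adjacencies == set(range(a, b)))
    return "I" if is_type_one else "II"


first = [1, 2, (2, 3), 3, (3, 4), 4, (4, 5), 5, 3, (2, 3), 2, (1, 2), 1]
second = [2, 5, 3, (2, 3), 2]
third = [2, 4, (4, 5), 5, 3, (2, 3), 2]
print(cycle_type(first), cycle_type(second), cycle_type(third))  # I II II
```

The third cycle visits all genes from 2 to 5 but misses the adjacency $3 \cdot 4$, which is why it is of type II.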

Lemma 3.5 Let $C$ be a (not necessarily simple) conflict-cycle of type I. Then, $|\mathcal{W}(C)| = 1$.

Lemma 3.6 Let $C$ be a (not necessarily simple) cycle with $W_1(C) = \{i \cdot (i+1) \mid a \leq i < b\}$ being
an element of $\mathcal{W}(C)$. Then, $C$ is a conflict-cycle of type II iff there exists a vertex $c \in \Sigma(C)$ such
that $c \notin [a, b]$.

By considering Lemmas 3.1 and 3.2, we can further obtain the following lemma.

Lemma 3.7 Let $C$ be a (not necessarily simple) cycle with $|\mathcal{W}(C)| \geq 2$. Then, $C$ is a conflict-cycle
of type II.

The first implication of our new definition of conflict-cycle is that a conflict-cycle does not 
necessarily contain a simple conflict-subcycle. 
Lemma 3.8 If $C$ is a conflict-cycle of type I, then it cannot be a simple cycle.

Proof. By contradiction, suppose that $C$ is simple. By definition of a type I conflict-cycle, there
exist two vertices $a$ and $b$ such that $V(C) = \{i \cdot (i+1) \mid a \leq i < b\} \cup \{i \mid a \leq i \leq b\}$. Since
$C$ is simple, every vertex in $V(C)$ is adjacent to exactly two distinct vertices in $C$; therefore, every
vertex has indegree and outdegree both exactly one in $C$. Knowing that every vertex $i \cdot (i+1) \in W$
has only two distinct adjacent vertices in $G_\Pi$, i.e., $i$ and $(i+1)$, we can deduce that every vertex
$i$ such that $a < i < b$ is adjacent to both $(i-1) \cdot i$ and $i \cdot (i+1)$ by using arcs from $F$. Also,
the vertex $a$ is adjacent to $a \cdot (a+1)$ and the vertex $b$ is adjacent to $(b-1) \cdot b$, both also using arcs
from $F$. Consequently, $C$ must contain an arc between $a$ and $b$ so that both vertices have degree
two (because no other vertex can be incident to an arc of $D(C)$). Moreover, this arc is
the only arc in $D(C)$, which contradicts the fact that a conflict-cycle must contain at
least two arcs from $D$, i.e., at least one from $X(C)$ and at least one from $Y(C)$. ■

Lemma 3.9 If $C$ is a non-simple conflict-cycle of type II, then it must contain a simple
conflict-subcycle of type II.

Proof. Let $\mathcal{W}(C) = \{W_1(C), W_2(C), \ldots, W_l(C)\}$. Since $C$ is not simple, there exists a vertex
$u$ used twice in it such that $C = u \xrightarrow{P}^{+} u \xrightarrow{Q}^{+} u$. We can further assume that $u \in \Sigma(C)$. If initially
we have $u \in W(C)$ such that $u = a \cdot (a+1)$, then $C$ uses both vertices $a$ and $(a+1)$ at least
twice because it uses the vertex $u = a \cdot (a+1)$ twice. So, we may substitute $u$ by $a$ and still write
$C = u \xrightarrow{P}^{+} u \xrightarrow{Q}^{+} u$.

Let $C_1 = u \xrightarrow{P}^{+} u$ and $C_2 = u \xrightarrow{Q}^{+} u$. Clearly, $C_1$ and $C_2$ are two subcycles of $C$, so we
write $\mathcal{W}(C_1) = \{W_1(C_1), W_2(C_1), \ldots, W_{l_1}(C_1)\}$ and $\mathcal{W}(C_2) = \{W_1(C_2), W_2(C_2), \ldots, W_{l_2}(C_2)\}$,
where $l_1 \geq 1$ and $l_2 \geq 1$. Note that every element of $\mathcal{W}(C_1)$ and of $\mathcal{W}(C_2)$ is a subset of an element
of $\mathcal{W}(C)$. Below we distinguish two possible cases.

In the first case, we assume that there exist an element of $\mathcal{W}(C_1)$ and an element of $\mathcal{W}(C_2)$ (say,
$W_1(C_1) = \{i \cdot (i+1) \mid a_{11} \leq i < b_{11}\}$ and $W_1(C_2) = \{i \cdot (i+1) \mid a_{21} \leq i < b_{21}\}$, respectively)
such that both are subsets of the same element of $\mathcal{W}(C)$ (say, $W_1(C) = \{i \cdot (i+1) \mid a_1 \leq i < b_1\}$).
This implies that $a_1 \leq a_{11} < b_{11} \leq b_1$ and $a_1 \leq a_{21} < b_{21} \leq b_1$. Since $C$ is a conflict-cycle of
type II, by Lemma 3.6 there exists a vertex $c_1 \in \Sigma(C)$ such that $c_1 \notin [a_1, b_1]$. Thus, we have both
$c_1 \notin [a_{11}, b_{11}]$ and $c_1 \notin [a_{21}, b_{21}]$. Note that the vertex $c_1$ appears on either $C_1$ or $C_2$. If $c_1$
appears on $C_1$, then $C_1$ is a conflict-cycle (by Lemma 3.2). Otherwise, $c_1$ must appear on $C_2$. By
Lemma 3.2 once again, $C_2$ would be a conflict-cycle. Moreover, this conflict-cycle, whether $C_1$
or $C_2$, is of type II (by Lemma 3.7).

In the second case, we assume that no two elements of $\mathcal{W}(C_1)$ and $\mathcal{W}(C_2)$ are subsets
of the same element of $\mathcal{W}(C)$. Consider the first elements of $\mathcal{W}(C_1)$ and $\mathcal{W}(C_2)$, and write them
as $W_1(C_1) = \{i \cdot (i+1) \mid a_{11} \leq i < b_{11}\}$ and $W_1(C_2) = \{i \cdot (i+1) \mid a_{21} \leq i < b_{21}\}$,
respectively. Note that $W_1(C_1)$ and $W_1(C_2)$ are subsets of two distinct elements of $\mathcal{W}(C)$ (say, $W_1(C) =
\{i \cdot (i+1) \mid a_1 \leq i < b_1\}$ and $W_2(C) = \{i \cdot (i+1) \mid a_2 \leq i < b_2\}$, respectively). Thus,
we have $[a_{11}, b_{11}] \subseteq [a_1, b_1]$ and $[a_{21}, b_{21}] \subseteq [a_2, b_2]$ and, furthermore, $[a_{11}, b_{11}] \cap [a_{21}, b_{21}] = \emptyset$
since $[a_1, b_1] \cap [a_2, b_2] = \emptyset$. It then follows that we have either $u \notin [a_{11}, b_{11}]$ or $u \notin [a_{21}, b_{21}]$. If
$u \notin [a_{11}, b_{11}]$, $C_1$ would be a conflict-cycle of type II. If $u \notin [a_{21}, b_{21}]$, $C_2$ would be a conflict-cycle
of type II.

In either case above, we have shown that there exists a conflict-subcycle of type II for $C$. If
this conflict-subcycle is not simple, we may apply the above process recursively, which necessarily
ends with a simple conflict-subcycle of type II. ■

Although the following theorem appears to be a verbatim account of Theorem 4 in [3], the two are
not the same because conflict-cycles are defined in different ways. Consequently, the
corresponding proof given in [3] is not sufficient.

Theorem 3.10 Let $\Pi$ be a partial order, $G_\Pi = (V, E)$ its adjacency-order graph (with $V = \Sigma \cup W$
and $E = D \cup F$), and $W' \subseteq W$. Then there exists a total order $\pi$ over $\Sigma$, compatible with $\Pi$, and
containing every adjacency from $W'$ iff $G_\Pi[W' \cup \Sigma]$ has no conflict-cycle.

Proof. ($\Rightarrow$) Let $\pi$ be a linearization of $\Pi$ containing every adjacency of $W'$. We suppose, by
contradiction, that there exists in $G_\Pi[W' \cup \Sigma]$ a conflict-cycle $C$. Below we distinguish two cases,
depending on whether $C$ is of type I or of type II.

In the first case, $C$ is assumed to be of type I. By definition, there exist two integers $a$ and $b$
such that $W(C) = \{i \cdot (i+1) \mid a \leq i < b\}$ and $\Sigma(C) = \{i \mid a \leq i \leq b\}$. Since $C$ is a conflict-cycle,
there exists an arc $i_1 \to j_1 \in X$ such that $a \leq j_1 < i_1 \leq b$ and an arc $i_2 \to j_2 \in Y$ such that
$a \leq i_2 < j_2 \leq b$. By Lemma 3.4, the arc $i_1 \to j_1$ implies that the sequence $b\,(b-1)\,(b-2) \cdots a$
appears as an interval of $\pi$, while at the same time the arc $i_2 \to j_2$ implies that the sequence
$a\,(a+1)\,(a+2) \cdots b$ appears as an interval of $\pi$; a contradiction.

In the second case, $C$ is assumed to be a conflict-cycle of type II. W.l.o.g., we may further
assume that $C$ is a simple conflict-cycle of type II (by Lemma 3.9). Let $C = v_0 \to v_1 \to \cdots \to
v_l = v_0$, where all the vertices are pairwise distinct except $v_0 = v_l$. Let $i_0 = 0, i_1, \ldots, i_{h-1}, i_h = l$
be the increasing sequence of indices such that $v_{i_j} \to v_{i_j+1} \in D$ for all $j$ such that $0 \leq j < h$. Note
that $h \geq 2$ (because $|D(C)| \geq 2$) and, for all $j$, we have $v_{i_j} \in \Sigma$. Let us prove that for all $j < h$
the ordering relation $v_{i_j} \prec_\pi v_{i_{j+1}}$ holds. The case where $i_{j+1} = i_j + 1$ is easy, since the arc
$v_{i_j} \to v_{i_{j+1}} \in D$ implies that $v_{i_j} \prec_\Pi v_{i_{j+1}}$ (by construction of $G_\Pi$) and $v_{i_j} \prec_\pi v_{i_{j+1}}$ (since $\pi$ is compatible
with $\Pi$). Now, assume there are several arcs between $v_{i_j}$ and $v_{i_{j+1}}$, i.e., $v_{i_{j+1}} = v_{i_j+m}$ with $m \geq 2$.
Let $P = v_{i_j+1} \to v_{i_j+2} \to \cdots \to v_{i_j+m}$, in which all the arcs are from $F$ and $v_{i_j+1}, v_{i_j+m} \in \Sigma$. If
$v_{i_j+1} < v_{i_j+m}$, then $W(P) = \{i \cdot (i+1) \mid v_{i_j+1} \leq i < v_{i_j+m}\}$ and $\Sigma(P) = \{i \mid v_{i_j+1} \leq i \leq v_{i_j+m}\}$.
By Lemma 3.3, the sequence $v_{i_j+1}\,(v_{i_j+1}+1)\,(v_{i_j+1}+2) \cdots v_{i_j+m}$, or its reverse, appears as an interval of $\pi$. If
$v_{i_j+1} > v_{i_j+m}$, then $W(P) = \{i \cdot (i+1) \mid v_{i_j+m} \leq i < v_{i_j+1}\}$ and $\Sigma(P) = \{i \mid v_{i_j+m} \leq i \leq v_{i_j+1}\}$.
Again, by Lemma 3.3, the sequence $v_{i_j+1}\,(v_{i_j+1}-1)\,(v_{i_j+1}-2) \cdots v_{i_j+m}$, or its reverse, appears as an interval
of $\pi$. In either case, all the vertices in $\Sigma(P)$ therefore appear as an interval of $\pi$. Note that $v_{i_j}$ is
a vertex distinct from $v_{i_{j+1}}$ (since $h \geq 2$), and from the other vertices in the set $\Sigma(P)$ as well (since
each of them is the source of an arc from $F$ in $C$, whereas $v_{i_j}$ is the source of an arc from $D$ in $C$).
Consequently, $v_{i_j}$ cannot appear inside the interval of $\pi$ formed by the vertices of $\Sigma(P)$.
As $v_{i_j}$ precedes $v_{i_j+1}$ in $\Pi$ (and thus in $\pi$), we have
$v_{i_j} \prec_\pi v_i$ for all $i \in [i_j + 1, i_j + m]$, and in particular, $v_{i_j} \prec_\pi v_{i_{j+1}}$.

In conclusion, we have $v_{i_j} \prec_\pi v_{i_{j+1}}$ for all $j < h$ and $v_{i_h} = v_{i_0}$, leading to a contradiction since
there is no cycle in the ordering relation $\prec_\pi$. Therefore, the subgraph $G_\Pi[W' \cup \Sigma]$ does not contain
any conflict-cycle.

($\Leftarrow$) (constructive proof) We use the following method to construct a linearization $\pi$ of $\Pi$
containing all adjacencies of $W'$, where the subgraph $G' = G_\Pi[W' \cup \Sigma]$ is assumed to contain
no conflict-cycles. We denote by $V_1, \ldots, V_h$ the strongly connected components of $G'$, ordered
by topological order (i.e., if $u, v \in V_i$, there exists a path from $u$ to $v$; moreover, if $u \in V_i$ and
$v \in V_j$ and there exists a path $u \to^{*} v$ in $G'$, then $i \leq j$). Then, we sort the elements of each
set $V_i \cap \Sigma$ in descending order of integers if there exists an arc from $X$ connecting two vertices
in $V_i \cap \Sigma$; otherwise, we sort them in ascending order. The resulting sequence is denoted by $\mu_i$, and
the concatenation $\mu_1 \cdot \mu_2 \cdots \mu_h$ gives $\pi$, a total order over $\Sigma$. We now check that $\pi$ contains every
adjacency in $W'$ and is compatible with $\Pi$.

Let $a \cdot (a+1) \in W'$. Vertices $a$ and $a+1$ are in the same strongly connected component $V_i$,
due to the arcs $a \leftrightarrow a \cdot (a+1) \leftrightarrow (a+1)$. Those two elements are obviously consecutive in the
corresponding $\mu_i$, and appear as an adjacency in $\pi$.

To show that $\pi$ is compatible with $\Pi$, it suffices to show that $a \prec_\pi b$ holds for every arc
$a \to b \in D$. By contradiction, suppose that there exist two distinct elements $a, b \in \Sigma$ such that
$a \to b \in D$ but $b \prec_\pi a$. We denote by $i$ and $j$ the indices such that $a \in V_i$ and $b \in V_j$. Since
$b \prec_\pi a$, we have $j \leq i$, and since $a \to b \in D$ (the arc $a \to b$ is in $G'$ as well), we have $i \leq j$. We thus
deduce that $i = j$; therefore, $a$ and $b$ share the same strongly connected component. If $a \to b \in X$,
then $a > b$ and $a \prec_\pi b$ (by the construction of $\pi$); a contradiction. Therefore, $a \to b \in Y$, which
implies that $a < b$. Since $b \prec_\pi a$, by the construction of $\pi$ once again, there must exist an
arc $c \to d \in X$ such that $c$ and $d$ belong to the same strongly connected component as $a$ and $b$.
It follows that there exists a path $P_1$ from $b$ to $c$ in $G'$ and also a path $P_2$ from $d$ to $a$ in
$G'$. Consequently, we obtain a cycle $a \to_Y b \xrightarrow{P_1}^{*} c \to_X d \xrightarrow{P_2}^{*} a$, which, by definition, is a
conflict-cycle in $G'$; a contradiction. ■
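The constructive direction of the proof translates fairly directly into code. The Python sketch below is one possible reading of it, under assumptions made here: genes are the integers $1, \ldots, n$, the adjacency $i \cdot (i+1)$ is encoded as the tuple `(i, i + 1)`, the set `kept` lists the indices $i$ of the adjacencies in $W'$, and the components are obtained with Kosaraju's algorithm, which yields them in topological order. The function name `linearize` is hypothetical, and the absence of conflict-cycles is assumed rather than checked.

```python
def linearize(n, D, kept):
    """Concatenate the sorted components mu_1 . mu_2 ... mu_h.

    D is a set of arcs (u, v) on genes 1..n; kept holds the indices i of
    the adjacencies i.(i+1) to preserve. No conflict-cycle is assumed.
    """
    vertices = list(range(1, n + 1)) + [(i, i + 1) for i in sorted(kept)]
    succ = {v: [] for v in vertices}
    pred = {v: [] for v in vertices}
    arcs = list(D)
    for i in kept:
        w = (i, i + 1)
        arcs += [(w, i), (w, i + 1), (i, w), (i + 1, w)]
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)

    # Pass 1: record vertices in order of DFS completion.
    order, seen = [], set()
    for root in vertices:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))

    # Pass 2: components of the reversed graph, in topological order.
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, stack = [], [root]
        while stack:
            node = stack.pop()
            component.append(node)
            for w in pred[node]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        components.append(component)

    pi = []
    for component in components:
        genes = {v for v in component if isinstance(v, int)}
        # Descending order if an arc of X joins two genes of the component.
        descending = any(u in genes and v in genes and u > v for u, v in D)
        pi.extend(sorted(genes, reverse=descending))
    return pi


# Hypothetical instance keeping the adjacencies 1.2, 3.4 and 4.5.
print(linearize(5, {(1, 2), (2, 5), (5, 3)}, {1, 3, 4}))  # [1, 2, 5, 4, 3]
```

In this instance the component $\{3, 4, 5\}$ contains the arc $5 \to 3$ of $X$ and is therefore sorted in descending order, while $\{1, 2\}$ is sorted in ascending order.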


4 Approximation 

4.1 Approximation of the MBL problem 

To assist in solving the minimum breakpoint linearization problem, the above theorem motivates
us to formulate a new combinatorial optimization problem on an adjacency-order graph. Given an
adjacency-order graph $G_\Pi = (V, E)$, where $V = \Sigma \cup W$ with $E = D \cup F$ and $D = X \cup Y$, a
subset $W''$ of $W$ is called a breakpoint vertex set if the deletion of the vertices in $W''$ leaves the induced
subgraph $G_\Pi[V - W'']$ without any cycle using arcs from both $X$ and $Y$. The minimum breakpoint
vertex set (MBVS) problem is thus defined as the problem of finding a breakpoint vertex set with
minimum cardinality. Theorem 3.10 leads to the following corollary.

Corollary 4.1 The value $k$ of an optimal solution of MBL($\Pi$) is the size of the minimum
breakpoint vertex set of $G_\Pi$.
Algorithm Approx-MBL

input   A directed acyclic graph $\Pi = (\Sigma, D)$
output  A linearization $\pi$ of $\Pi$

begin
    Create the adjacency-order graph $G_\Pi = (V, E)$ of $\Pi$;
    $W'' \leftarrow$ Approx-MBVS($G_\Pi$);
    $W' \leftarrow W - W''$;
    $(V_1, V_2, \ldots, V_h) \leftarrow$ SCC-sort($G_\Pi[W' \cup \Sigma]$);
    for $i \leftarrow 1$ to $h$
        $\mu_i \leftarrow$ sort($V_i \cap \Sigma$);
    $\pi \leftarrow \mu_1 \cdot \mu_2 \cdots \mu_h$;
    return $\pi$;
end

Table 1: An $(m^2 + 2m - 1)$-approximation for the MBL problem.

Corollary 4.1 implies that an approximation algorithm for the MBVS problem can be translated into an
approximation algorithm for the MBL problem with the same ratio.

As in [3], we denote by SCC-sort() an algorithm that decomposes a directed graph into its
strongly connected components and then topologically sorts these components. Also, let sort()
denote an algorithm that sorts the integer elements in each strongly connected component in either
descending or ascending order, as described in the constructive proof of
Theorem 3.10. Note that a different definition of sort() was used in [3], which always sorts integers
in ascending order. Table 1 summarizes the algorithm that is used to approximate the MBL
problem, Approx-MBL. It is derived from the constructive proof of Theorem 3.10 and relies
on an approximation algorithm for the MBVS problem that we describe in the next
subsection. Its correctness follows from Theorem 3.10.

4.2 Approximation of the MBVS problem

We start this subsection by introducing several more definitions. As similarly defined in [?], a path
$u \to^{*}_{D} v$ in $(\Sigma, D)$ is said to be a shortcut of a type II conflict-cycle $C$, if:

- $u, v \in \Sigma(C)$ (we write $P$ and $Q$ the paths such that $C = v \xrightarrow{P}^{+} u \xrightarrow{Q}^{+} v$),

- the cycle $C' = v \xrightarrow{P}^{+} u \to^{*}_{D} v$ is also a conflict-cycle of type II,

- $W(Q) \neq \emptyset$ (using the shortcut removes at least one adjacency).

A type II conflict-cycle is said to be minimal if it has no shortcut. On the other hand, a type I
conflict-cycle $C$ is said to be minimal if there does not exist another type I conflict-cycle $C'$ such that
$W(C')$ is a strict subset of $W(C)$. Note that the definition of shortcut does not apply to conflict-cycles of type
I. The following lemma ensures that removing minimal conflict-cycles is enough to remove all the
conflict-cycles.
Lemma 4.2 If an adjacency-order graph contains a conflict-cycle, then it also contains a minimal
conflict-cycle.

Proof. Let $C$ be a conflict-cycle. Suppose that $C$ is not minimal. If it is a conflict-cycle of
type I, by definition, we may find another type I conflict-cycle $C'$ with $|W(C')| < |W(C)|$; if it
is a conflict-cycle of type II, we may use the shortcut to create another conflict-cycle $C'$ of type II
also having $|W(C')| < |W(C)|$. Applied recursively, this process necessarily ends with a minimal
conflict-cycle. ■

Lemma 4.3 Let C be a minimal conflict-cycle. Then, C is simple if and only if it is of type II. 

Proof. ($\Rightarrow$) Since $C$ is a simple conflict-cycle, by Lemma 3.8, $C$ cannot be of type I. Therefore,
$C$ must be a conflict-cycle of type II.

($\Leftarrow$) By contradiction, suppose that $C$ is not simple. Since $C$ is of type II, by Lemma 3.9 it
must contain a simple conflict-subcycle $C'$ of type II. So, we may write $C = u \xrightarrow{C'}^{+} u \xrightarrow{Q}^{+} u$,
where $u \in \Sigma(C)$ (see the proof of Lemma 3.9). Let $R = u \to^{*} u$ be a path with an empty arc set.
We can see that $C' = u \xrightarrow{C'}^{+} u \xrightarrow{R}^{*} u$ is a conflict-cycle and that $W(Q) \neq \emptyset$ (since $Q$ is a cycle),
so the path $R$ is a shortcut of $C$. This contradicts the assumption that $C$ is a
minimal conflict-cycle. ■

Let $C$ be a cycle in $G_\Pi$ with $W(C) = \{W_1(C), W_2(C), \ldots, W_l(C)\}$, where
$W_h(C) = \{i \cdot (i+1) \mid a_h \le i < b_h\}$, for each $1 \le h \le l$. We call the vertices $a_h$ and $b_h$ the joints of $C$ and, in
particular, $a_h$ the low joint. Given a vertex $w = i \cdot (i+1) \in W(C)$, we say that $a_h$ and $b_h$ are the two
joints associated to $w$ in $C$ if $a_h \le i < b_h$. Note that joints are also defined in [3], but not in the
same way.
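To make the notion of joints concrete, the following Python sketch (not part of the original paper) groups the adjacencies used by a cycle into intervals and reads off the joints. It assumes that each adjacency $i \cdot (i+1)$ is encoded by its integer index $i$ and that the elements $W_h(C)$ are the maximal runs of consecutive adjacencies; the encoding and the names `joints_of` and `low_joints` are illustrative assumptions.

```python
def joints_of(adj_indices):
    """Return the joint pairs (a_h, b_h) of a cycle.

    adj_indices: iterable of ints i, each standing for the adjacency i.(i+1).
    Assumption: every W_h(C) is a maximal run {a_h, ..., b_h - 1} of indices.
    """
    pairs = []
    for i in sorted(set(adj_indices)):
        if pairs and pairs[-1][1] == i:      # i extends the current run
            pairs[-1] = (pairs[-1][0], i + 1)
        else:                                # i starts a new run
            pairs.append((i, i + 1))
    return pairs


def low_joints(adj_indices):
    """Low joints a_h, one per interval."""
    return [a for a, _ in joints_of(adj_indices)]
```

For instance, under these assumptions a cycle using the adjacencies 2·3, 3·4, 4·5 and 7·8 has the joint pairs (2, 5) and (7, 8), with low joints 2 and 7.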

Our approximation algorithm for the MBVS problem is summarized in Table 2. As we can
see, it consists of two main phases. In the first phase, the adjacency-order graph $G_\Pi$ is repeatedly
induced by deleting a set of low joints of a minimal type II conflict-cycle until there are no more
minimal type II conflict-cycles (except for one case where $m = 1$ and $|W(C)| = 1$). In the second
phase, the previously induced subgraph is further repeatedly induced by deleting the only two
joints of a type I conflict-cycle until there are no more minimal type I conflict-cycles. It is worth
noting that finding a minimal type II conflict-cycle is quite challenging, due to the presence of type
I conflict-cycles in the adjacency-order graph. We will discuss the polynomial-time algorithms for
finding type I and type II conflict-cycles in Subsection 5.2.


5 Performance Analysis 

5.1 Approximation ratio 

If $C$ is given as a minimal conflict-cycle of type II, it must be simple by Lemma 4.3. Hence, a joint
$e$ of $C$ has exactly two incident arcs, one belonging to $D(C)$ and the other belonging to $F(C)$. In
this case, we denote by $e^F$ the other vertex (rather than $e$) of the arc from $F(C)$, and by $e^D$ the
other vertex (rather than $e$) of the arc from $D(C)$.

Algorithm Approx-MBVS

input   An adjacency-order graph $G_\Pi(V, E)$

output  A breakpoint vertex set $W''$

begin
    $W'' \leftarrow \emptyset$;
    while there exists in $G_\Pi[V - W'']$ a minimal type II conflict-cycle $C$
        if $m = 1$ and $|W(C)| = 1$
            $J \leftarrow$ the set of joints of $C$;
        else
            $J \leftarrow$ the set of low joints of $C$;
        $W'' \leftarrow W'' \cup \{e^F : e \in J\}$;
    while there exists in $G_\Pi[V - W'']$ a minimal type I conflict-cycle $C$
        $J \leftarrow$ the set of joints of $C$;
        $W'' \leftarrow W'' \cup \{e^F : e \in J\}$;
    return $W''$;
end

Table 2: An $(m^2 + 2m - 1)$-approximation for the MBVS problem
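The two phases of Approx-MBVS (Table 2) can be sketched in Python as follows. This is only an illustrative skeleton under stated assumptions: a cycle is represented by its list of joint pairs $(a_h, b_h)$, an adjacency $i \cdot (i+1)$ by its index $i$ (so $a^F$ has index $a$ and $b^F$ has index $b - 1$), and the two finder callbacks `find_min_type2` and `find_min_type1` are hypothetical stand-ins for the search procedures of Subsection 5.2. The helper `make_finder` is a toy used only to exercise the loop; it does not check minimality.

```python
def approx_mbvs(m, find_min_type2, find_min_type1):
    """Two-phase deletion loop; returns the indices of deleted adjacencies."""
    removed = set()  # plays the role of W''

    def delete(cycle, low_only):
        for a, b in cycle:
            removed.add(a)              # a^F = a.(a+1)
            if not low_only:
                removed.add(b - 1)      # b^F = (b-1).b

    # Phase 1: minimal type II conflict-cycles.
    while (cycle := find_min_type2(removed)) is not None:
        both_joints = (m == 1 and len(cycle) == 1)
        delete(cycle, low_only=not both_joints)
    # Phase 2: minimal type I conflict-cycles (both joints are deleted).
    while (cycle := find_min_type1(removed)) is not None:
        delete(cycle, low_only=False)
    return removed


def make_finder(cycles):
    """Toy finder (hypothetical): a cycle survives while all its adjacencies do."""
    def find(removed):
        for cycle in cycles:
            if all(i not in removed for a, b in cycle for i in range(a, b)):
                return cycle
        return None
    return find
```

With `m = 2`, one toy type II cycle `[(1, 3), (5, 6)]` and one toy type I cycle `[(2, 4)]`, the first phase deletes indices 1 and 5 and the second phase deletes 2 and 3.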

As defined in [3], for each $u \in \Sigma$, we denote by $I(u) \subseteq \{1, \ldots, m\}$ the set of indices of the genetic
maps in which $u$ appears. Clearly, $I(u) \neq \emptyset$. For each arc $u \to_D v \in D$, we use $\eta(u \to_D v)$ to
denote the index of a genetic map in which $u$ and $v$ appear in consecutive blocks. So,
$\eta(u \to_D v) \in I(u) \cap I(v)$. Given a minimal type II conflict-cycle $C$, we extend the notation $\eta$ to each of its
joints $e$: let $\eta(e) = \eta(e^D \to e)$ if $C$ uses the arc $e^D \to e$; otherwise, let $\eta(e) = \eta(e \to e^D)$.
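As a small illustration of the index sets $I(u)$ and the labelling $\eta$, the sketch below computes both from a toy input. It assumes each genetic map is given as a list of blocks (sets of genes) in map order and that maps are numbered from 1; the function names and the choice of returning the smallest suitable map index are assumptions made for this example only.

```python
def gene_map_index(maps):
    """I(u): the set of map indices (1-based) in which gene u appears."""
    index = {}
    for k, blocks in enumerate(maps, start=1):
        for block in blocks:
            for gene in block:
                index.setdefault(gene, set()).add(k)
    return index


def eta(maps, u, v):
    """Index of a map in which u and v lie in consecutive blocks (u first).

    Returns the smallest such index (an arbitrary choice), or None.
    """
    for k, blocks in enumerate(maps, start=1):
        for left, right in zip(blocks, blocks[1:]):
            if u in left and v in right:
                return k
    return None
```

For two toy maps `[{1, 2}, {3}]` and `[{2}, {4}, {3}]`, gene 2 appears in both maps, `eta(maps, 2, 3)` is 1 and `eta(maps, 4, 3)` is 2.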

Lemma 5.1 [3] Let $e \to_D f$ be an arc of $D$, and let $u \in \Sigma$ such that $\eta(e \to_D f) \in I(u)$. Then one
of the paths $e \to^{*}_{D} u$ or $u \to^{*}_{D} f$ appears in the graph $(\Sigma, D)$.

Lemma 5.2 [3] Let $C$ be a (not necessarily simple) cycle of $G_\Pi$. Let $c \in \Sigma$, such that there exist
$a, b \in \Sigma(C)$ with $a \le c < b$. Then, one of the following propositions is true:

(i) $C$ contains an arc $u \to_X v$ with $v \le c < u$;

(ii) $C$ contains both arcs $c + 1 \to_F c \cdot (c+1)$ and $c \cdot (c+1) \to_F c$.

We can further obtain the following lemma, which can be proved by using the same arguments 
as those for proving the preceding lemma. 

Lemma 5.3 Let $C$ be a (not necessarily simple) cycle of $G_\Pi$. Let $c \in \Sigma$, such that there exist
$a, b \in \Sigma(C)$ with $a \le c < b$. Then, one of the following propositions is true:

(i) $C$ contains an arc $u \to_Y v$ with $u \le c < v$;

(ii) $C$ contains both arcs $c \to_F c \cdot (c+1)$ and $c \cdot (c+1) \to_F c + 1$.

Proof. Define $c^+ = \{d \mid d > c\} \cup \{d \cdot (d+1) \mid d > c\}$ and $c^- = \{d \mid d \le c\} \cup \{d \cdot (d+1) \mid d < c\}$.
Then, $c^+$, $\{c \cdot (c+1)\}$ and $c^-$ form a partition of $V$. We show that when proposition (i) is false,
proposition (ii) is necessarily true. Assume that proposition (i) is false. Since $C$ contains vertices
in both $c^+ \cup \{c \cdot (c+1)\}$ and $c^-$ (resp. $b$ and $a$), it thus contains an arc $u \to v$ with $u \in c^-$
and $v \in c^+ \cup \{c \cdot (c+1)\}$. We must have $u \to v \in F$; otherwise, $u \to v \in D$ implies
$u \to v \in Y$ (since $u < v$), and proposition (i) would be true, a contradiction. Necessarily, $u = c$
and $v = c \cdot (c+1)$ (because there is no arc in $F$ going out of $c^-$ into $c^+$). So, $C$ contains the arc
$c \to_F c \cdot (c+1)$. Using the same argument, we can show that there is an arc $u \to v$ in $C$ with
$u \in \{c \cdot (c+1)\} \cup c^-$ and $v \in c^+$. Since $u \to v$ cannot be in $Y$ (since proposition (i) is false) nor
in $X$ (since these arcs go from $c^+$ to $c^-$), it must be in $F$, and we can only have $u = c \cdot (c+1)$
and $v = c + 1$. So, $C$ also uses the arc $c \cdot (c+1) \to_F c + 1$, and thus proposition (ii) is true. ■

The following two lemmas already appeared verbatim in [3], except that a type II conflict-cycle
is additionally imposed here. However, due to a different definition of conflict-cycles, the proofs
as given in [3] are not sufficient.¹

Lemma 5.4 Let $C$ be a minimal type II conflict-cycle where three vertices $u, e, f \in \Sigma(C)$ are such
that

- $C = u \xrightarrow{P_1}{}^{+} e \to_D f \xrightarrow{P_2}{}^{+} u$;

- each of the paths $P_1$ and $P_2$ uses at least one vertex from $W$ and at least one arc from $D$.

Then $\eta(e \to_D f) \notin I(u)$.

Proof. (We adapt the proof of Lemma 14 in [3] to our definition of conflict-cycles.) Since $C$
is a minimal type II conflict-cycle, by Lemma 4.3 it must be simple. By contradiction, suppose
that $\eta(e \to_D f) \in I(u)$. Then, by Lemma 5.1, there exists a path $R$ in $D$ connecting either
$e$ to $u$ or $u$ to $f$. In the first case, we write $P = P_1$ and $Q = e \to_D f \xrightarrow{P_2}{}^{+} u$, and in the
second, $P = P_2$ and $Q = u \xrightarrow{P_1}{}^{+} e \to_D f$, so that there exists a cycle $C' = u \xrightarrow{P}{}^{+} e \xrightarrow{R}{}^{*}_{D} u$
(resp., $C' = f \xrightarrow{P}{}^{+} u \xrightarrow{R}{}^{*}_{D} f$). Since $C$ is a minimal type II conflict-cycle, $R$ cannot be
a shortcut, and with $W(Q)$ not being empty, cycle $C'$ cannot be a conflict-cycle of type II. Let
$W_1(C') = \{i \cdot (i+1) \mid a \le i < b\}$. Thus, by Lemma 3.6, for all $c \in \Sigma(C')$, we have $c \in [a, b]$, so
that $\Sigma(C') = \{i \mid a \le i \le b\}$ and $|W(C')| = 1$. It turns out that $V(C') \subseteq V(C)$. Note that $R$ does
not use any arc from $F$, so the vertices in $W_1(C')$ all come from the path $P$. Moreover, because the
path $P$ is part of the simple conflict-cycle $C$ and $|W(C')| = 1$, the path $P$ (and the cycles $C'$ and $C$
too) must use a path either $a \to^{*}_{F} b$ or $b \to^{*}_{F} a$. W.l.o.g., this path is assumed to be $a \to^{*}_{F} b$.

Also note that $P$ uses at least one arc from $D(C)$. Let $a' \to_D b'$ be such an arc, such that
$a' \in \Sigma(C')$ and $b' \in \Sigma(C')$ (i.e., $a \le a' \le b$ and $a \le b' \le b$). If $a' < b'$, we may write a
cycle $C'' = a' \to_D b' \to^{*}_{F} b \to^{+}_{E(C)} e \to_D f \to^{+} u \to^{+}_{E(C)} a'$, which does not use any vertices
in $W(P_3)$ where the path $P_3 = a' \to^{*}_{F(P)} b'$. Otherwise, $a' > b'$, so we may write a cycle
$C'' = a' \to_D b' \to^{*}_{F(P)} a'$, which does not use any vertices in $W(P_2)$. In either case, we can see
that $C''$ is a subcycle of $C$, implying that the latter is not a simple cycle; a contradiction. ■

¹ One might argue that the corresponding proofs given in [3] shall be sufficient since a type II conflict-cycle is
always a conflict-cycle according to the definition in [3]. Note that, however, a minimal type II conflict-cycle may not
be a minimal conflict-cycle as defined in [3]. Therefore, those proofs are still not sufficient.

Lemma 5.5 Let $C$ be a minimal type II conflict-cycle, with $\lambda \ge 5$ joints. Let $e$ and $f$ be two
non-consecutive joints of $C$. Then $\eta(e) \neq \eta(f)$.

Proof. (Please refer to the proof of Lemma 15 in [3], together with Lemma 5.4 above.) ■

Lemma 5.6 Let $C$ be a minimal type II conflict-cycle with $W_1(C) = \{i \cdot (i+1) \mid a \le i < b\}$ being
an element of $W(C)$. Then, we have $a^D \notin [a, b]$ and $b^D \notin [a, b]$.

Proof. First note that $a \neq a^D$. By definition, the cycle $C$ uses an arc from $D$, either $a \to a^D$ or
$a^D \to a$. W.l.o.g., we assume that this arc is $a \to a^D \in D(C)$. Since $C$ is a minimal type II
conflict-cycle, it must be simple (by Lemma 4.3). Moreover, $W_1(C) = \{i \cdot (i+1) \mid a \le i < b\} \in W(C)$
implies that $C$ uses a path either $a \to^{*}_{F} b$ or $b \to^{*}_{F} a$. In the current case, however, this path can
only be $b \to^{*}_{F} a$ since $C$ uses the arc $a \to a^D$ too.

By contradiction, assume that $a^D \in [a, b]$; further, $a < a^D \le b$ since $a \neq a^D$. It hence implies
that there exists a path $a^D \to^{*}_{F} a$ in $C$. We may write a cycle $C' = a \to_D a^D \to^{*}_{F} a$, for which any
vertex $e \in \Sigma(C')$ is such that $a \le e \le b$. On the other hand, by Lemma 3.6, there exists a vertex
$c \in \Sigma(C)$ such that $c \notin [a, b]$. Thus, $c \notin \Sigma(C')$, so that $C'$ is a subcycle of $C$. This, however, contradicts
the fact that $C$ is a simple cycle. This proves $a^D \notin [a, b]$. By using the same arguments, we
can also prove $b^D \notin [a, b]$. ■

Lemma 5.7 Let $C$ be a minimal type II conflict-cycle with $|W(C)| \ge 2$ and
$W_1(C) = \{i \cdot (i+1) \mid a \le i < b\}$ being an element of $W(C)$. Let $c$ be a vertex in $\Sigma$.

(i) If $a < c < b$ and $\eta(a) \in I(c)$, then $a^D$ and $c$ appear in the same block of the genetic map
$\eta(a)$.

(ii) If $a < c < b$ and $\eta(b) \in I(c)$, then $b^D$ and $c$ appear in the same block of the genetic map
$\eta(b)$.

Proof. We present below the proof of (i) only, because (ii) can be proved similarly. Since
$W_1(C) = \{i \cdot (i+1) \mid a \le i < b\}$, the cycle $C$ uses either the path $a \to^{*}_{F} b$ or $b \to^{*}_{F} a$. W.l.o.g.,
we assume that $C$ uses the path $a \to^{*}_{F} b$. Because $a < c < b$, this path goes via the vertex $c$.
Since $C$ is a minimal type II conflict-cycle, by Lemma 5.6 we have $a^D \notin [a, b]$. Moreover, by
definition, $W(C)$ shall contain another element $W_2(C) = \{i \cdot (i+1) \mid a' \le i < b'\}$, where both
vertices $a'$ and $b'$ shall be located on the path $b^D \to^{*} a^D$. W.l.o.g., we assume that $a'$ is visited
before $b'$ in the path $b^D \to^{*} a^D$. Thus, we may write $P$ the path $a^D \to_D a \to^{*}_{F} c$ and $Q$ the path
$c \to^{*}_{F} b \to_D b^D \to^{*} a' \to^{*}_{F} b' \to^{*} a^D$.

Since $\eta(a) \in I(c)$, $a^D$ and $c$ (and $a$ as well) appear in the same genetic map numbered $\eta(a)$.
So, we distinguish three cases below.

- In the first case, there exists the path $R = a^D \to^{*}_{D} c$ in $(\Sigma, D)$. Let $C' = c \xrightarrow{Q}{}^{+} a^D \xrightarrow{R}{}^{*}_{D} c$.
Note that no vertex in $W$ appears in $R$, so $W_2(C) = \{i \cdot (i+1) \mid a' \le i < b'\}$ must appear
as an element of $W(C')$. By Lemma 3.1, we have $b \notin [a', b']$. Then, by Lemma 3.6, $C'$ is a
conflict-cycle of type II. With $W(P)$ not being empty, it follows that $R$ is a shortcut of $C$, a
contradiction.

- In the second case, there exists the path $R = c \to^{*}_{D} a^D$ in $(\Sigma, D)$. Let $C' = c \xrightarrow{R}{}^{*}_{D} a^D \xrightarrow{P}{}^{+} c$.
Note that no vertex in $W$ appears in $R$, so $W_1(C') = \{i \cdot (i+1) \mid a \le i < c\}$ must appear as
an element of $W(C')$. By Lemma 5.6, we have $a^D \notin [a, b]$, which implies that $a^D \notin [a, c]$.
By Lemma 3.6, $C'$ is a conflict-cycle of type II. With $W(Q)$ not being empty, it follows that
$R$ is a shortcut of $C$, a contradiction.

- In the third case, $a^D$ and $c$ are incomparable in $(\Sigma, D)$. Since they appear in the same genetic
map numbered $\eta(a)$, they should appear in the same block of this map. ■

It can be seen that the proof of the preceding lemma also implies the following lemma. 

Lemma 5.8 Let $C$ be a minimal type II conflict-cycle with $W_1(C) = \{i \cdot (i+1) \mid a \le i < b\}$ being
an element of $W(C)$. Let $c$ be a vertex in $\Sigma$.

(i) If $a < c < b$ and $\eta(a) \in I(c)$, then $a^D$ and $c$ appear in the same block of the genetic map
$\eta(a)$.

(ii) If $a < c < b$ and $\eta(b) \in I(c)$, then $b^D$ and $c$ appear in the same block of the genetic map
$\eta(b)$.

Lemma 5.9 Let $w = v \cdot (v+1) \in W$. Then, there exists at most one minimal type I conflict-cycle
being considered during the execution of Approx-MBVS going via $w$.

Proof. By contradiction, assume that $C_1$ and $C_2$ are two minimal type I conflict-cycles being
considered during the execution of Approx-MBVS, in this order, such that $w \in W(C_1) \cap W(C_2)$.
By definition, let $W(C_1) = \{i \cdot (i+1) \mid a_1 \le i < b_1\}$ and $W(C_2) = \{i \cdot (i+1) \mid a_2 \le i < b_2\}$. Since
$w = v \cdot (v+1) \in W(C_1) \cap W(C_2)$, we have $a_1 \le v < b_1$ and $a_2 \le v < b_2$. On the other hand,
because the vertices $a_1^F = a_1 \cdot (a_1 + 1)$ and $b_1^F = (b_1 - 1) \cdot b_1$ are removed when $C_1$ is considered,
they cannot appear in $C_2$ so that $a_1 < a_2$ and $b_1 > b_2$. Thus, $a_1 < a_2 < b_2 < b_1$, so that $W(C_2)$
is a strict subset of $W(C_1)$. This, however, contradicts the fact that $C_1$ is a minimal conflict-cycle. ■


Lemma 5.10 Let $w = v \cdot (v+1) \in W$ and $m = 1$. Then, there exists at most one minimal (type I
or type II) conflict-cycle being considered during the execution of Approx-MBVS going via $w$.

Proof. By contradiction, assume $C_1$ and $C_2$ are two minimal conflict-cycles being considered
during the execution of Approx-MBVS, in this order, such that $w \in W(C_1) \cap W(C_2)$. By
definition, let $W(C_1) = \{i \cdot (i+1) \mid a_1 \le i < b_1\}$ and $W(C_2) = \{i \cdot (i+1) \mid a_2 \le i < b_2\}$. Since
$w = v \cdot (v+1) \in W(C_1) \cap W(C_2)$, we have $a_1 \le v < b_1$ and $a_2 \le v < b_2$. On the other hand,
because the vertex $a_1^F = a_1 \cdot (a_1 + 1)$ is removed when $C_1$ is considered, it cannot appear in $C_2$ so
that $a_1 < a_2$. Thus, $a_1 < a_2 \le v < b_1$.

By Lemma 5.9, $C_1$ can only be of type II. By Lemma 5.7, we further know that $|W(C_1)| = 1$
(since $a_1 < v < b_1$). So, the vertex $b_1^F = (b_1 - 1) \cdot b_1$ will be removed too when $C_1$ is considered.
Hence, $b_2 < b_1$, so that $a_1 < a_2 \le v < b_2 < b_1$.

Next we show that there exists a path $u \to^{+}_{D} v$ such that $u \in [a_2, b_2]$ and $v \in [a_2, b_2]$. To this
end, we distinguish two cases. In the first case, $C_2$ is assumed to be of type I. By definition of
the type I conflict-cycles, there must exist a desired path since $\Sigma(C_2) = \{i \mid a_2 \le i \le b_2\}$. In the
second case, $C_2$ is assumed to be of type II. If $C_2$ uses the arc $a_2 \to a_2^F$, then there must exist a
path $a_2 \to^{+}_{D} b_2$. Otherwise, $C_2$ uses the arc $a_2^F \to a_2$, and then there must exist a path $b_2 \to^{+}_{D} a_2$. So,
we can always find a path $u \to^{+}_{D} v$ such that $u \in [a_2, b_2]$ and $v \in [a_2, b_2]$, regardless of the type of
$C_2$. We further obtain $a_1 < u < b_1$ and $a_1 < v < b_1$, since $a_1 < a_2$ and $b_2 < b_1$. By applying
Lemma 5.8 with $(C_1, u)$ and $(C_1, v)$ successively, we obtain

- $a_1^D$ and $u$ appear in the same block of the only genetic map,

- $a_1^D$ and $v$ appear in the same block of the only genetic map.

Therefore, $u$ and $v$ both come from the same block. However, the existence of the path $u \to^{+}_{D} v$
instead implies that they shall not appear in the same block, a contradiction. ■

Lemma 5.11 Let $w = v \cdot (v+1) \in W$, and let $C_1$, $C_2$ and $C_3$ be three minimal (either type I or type II)
conflict-cycles being considered during the execution of Approx-MBVS, in this order, such that
$w \in C_1 \cap C_2 \cap C_3$. Denote respectively by $a_1$, $a_2$ and $a_3$ the low joints associated to $w$ in $C_1$, $C_2$ and
$C_3$. Then we cannot have $\eta(a_1) = \eta(a_2) = \eta(a_3)$.

Proof. By Lemma 5.9, $C_1$ and $C_2$ must be conflict-cycles of type II, whereas $C_3$ could be of
either type I or type II.

By contradiction, assume that $\eta = \eta(a_1) = \eta(a_2) = \eta(a_3)$. Vertices $a_1$, $a_2$ and $a_3$ are low
joints associated to $w = v \cdot (v+1)$, so $a_1 \le v$, $a_2 \le v$ and $a_3 \le v$. The vertex $a_1^F = a_1 \cdot (a_1 + 1)$
is removed when $C_1$ is considered, so it cannot appear in $C_2$ or $C_3$. Thus, $a_1 < a_2$ and $a_1 < a_3$.
Similarly, we can have $a_2 < a_3$. Let $W_1(C_1) = \{i \cdot (i+1) \mid a_1 \le i < b_1\}$ (resp., $W_1(C_2) =
\{i \cdot (i+1) \mid a_2 \le i < b_2\}$) be the element of $W(C_1)$ (resp., $W(C_2)$) that contains $w = v \cdot (v+1)$.
Thus, $v < b_1$ and $v < b_2$, so $a_2 < b_1$, $a_3 < b_1$ and $a_3 < b_2$. Then, we may apply Lemma 5.8 with
$(C_1, a_2)$, $(C_1, a_3)$ and $(C_2, a_3)$ successively to obtain

- $a_1^D$ and $a_2$ appear in the same block of genetic map $\eta$,

- $a_1^D$ and $a_3$ appear in the same block of genetic map $\eta$,

- $a_2^D$ and $a_3$ appear in the same block of genetic map $\eta$.

Therefore, $a_2$ and $a_2^D$ both come from the same block of genetic map $\eta$, which contradicts $\eta(a_2) =
\eta$ (in the genetic map $\eta(a_2)$, $a_2$ and $a_2^D$ appear in consecutive blocks). ■

Lemma 5.12 Let $w = v \cdot (v+1) \in W$, and let $C_1$ and $C_2$ be two minimal conflict-cycles being considered
during the execution of Approx-MBVS, in this order, such that $w \in C_1 \cap C_2$ and $|W(C_1)| \ge 2$.
Denote respectively by $a_1$ and $a_2$ the low joints associated to $w$ in $C_1$ and $C_2$, and by $b_1$ the other
joint (rather than $a_1$) associated to $w$ in $C_1$. Then we cannot have $\eta(a_1) = \eta(b_1) = \eta(a_2)$.

Proof. By Lemma 5.9, $C_1$ must be a conflict-cycle of type II, whereas $C_2$ could be of either type
I or type II.

By contradiction, assume that $\eta = \eta(a_1) = \eta(b_1) = \eta(a_2)$. As shown in the preceding lemma,
we have $a_1 < a_2 \le v < b_1$. Then, we may apply Lemma 5.7 to obtain

- $a_1^D$ and $a_2$ appear in the same block of genetic map $\eta$,

- $b_1^D$ and $a_2$ appear in the same block of genetic map $\eta$,

- $a_1$ and $b_1^D$ appear in the same block of genetic map $\eta$.

Therefore, $a_1$ and $a_1^D$ both come from the same block of genetic map $\eta$, which contradicts $\eta(a_1) =
\eta$ (in the genetic map $\eta(a_1)$, $a_1$ and $a_1^D$ appear in consecutive blocks). ■

Lemma 5.13 Let $w \in W$ and $\mathcal{C}$ the set of all the minimal conflict-cycles being considered during
the execution of Approx-MBVS going via $w$. Let $J_w$ denote the total number of joints being
selected in these cycles (in order to remove adjacencies). Then, $J_w \le m^2 + 2m - 1$.

Proof. We write $w = v \cdot (v+1) \in W$, and $\mathcal{C} = \{C_1, \ldots, C_q\}$ the set of the $q$ conflict-cycles
being considered, in this order, during the execution of Approx-MBVS. In each cycle $C_h$, $w$ can
be associated to a low joint $v_h$ and to the corresponding deleted vertex $w_h = v_h^F = v_h \cdot (v_h + 1)$.
We write $\lambda_h$ the number of joints of $C_h$. If $C_h$ is a minimal type II conflict-cycle, then $\lambda_h / 2$ is the
number of low joints (and thus the maximum number of deleted vertices) in this cycle. Otherwise,
it is of type I, so $\lambda_h / 2 = 1$, but the number of deleted vertices in this cycle could be up to 2. Since $w_h$
is deleted while $C_h$ is considered, we have $w_h \notin W(C_{h'})$ and $v_h < v_{h'} \le v$, for all $h' > h$. Indeed,
$\forall u \in \{v_{h'}, \ldots, v\}$, the vertex $u \cdot (u+1)$ belongs to $W(C_{h'})$.

By Lemma 5.9, there exists at most one minimal type I conflict-cycle being considered during
the execution of Approx-MBVS going via $w$. Thus, the first $q - 1$ cycles must be all of type II,
while the last cycle $C_q$ may be of either type I or type II, depending on whether a minimal type I
conflict-cycle is considered or not.

Consider now the list $(\eta(v_1), \eta(v_2), \ldots, \eta(v_q))$. Unlike in a set, duplicate values are allowed
in a list. By Lemma 5.11, we know that no value can appear more than twice in the list. Hence,
$q \le 2m$. Indeed, we can further show below that $q \le 2m - 1$ when $\lambda_1 \ge 4$ (i.e., when $|W(C_1)| \ge
2$). By contradiction, suppose that $q = 2m$ when $\lambda_1 \ge 4$. So, $q \ge 2$, which implies that there
are at least two minimal conflict-cycles being considered during the execution of Approx-MBVS
going via $w$. By Lemma 5.9, the first conflict-cycle $C_1$ must be of type II. Let $e$ be the other joint
rather than $v_1$ in $C_1$ associated to $w$. Because $q = 2m$, by Lemma 5.11 we can find exactly two
distinct vertices $v_i$ and $v_j$ such that $\eta(e) = \eta(v_i) = \eta(v_j)$ and $1 \le i < j \le q = 2m$. Recall that
$v_i$ and $v_j$ are the respective low joints of $C_i$ and $C_j$ that are both associated to $w$. So, neither $v_i$ nor
$v_j$ coincide with $e$ (but $v_i$ might coincide with $v_1$) and, moreover, $v_1 \le v_i < v_j < e$. By using
Lemma 5.7 with $(C_1, v_i)$, $(C_1, v_j)$ and $(C_i, v_j)$ successively, we obtain

- $e^D$ and $v_i$ appear in the same block of genetic map $\eta$,

- $e^D$ and $v_j$ appear in the same block of genetic map $\eta$,

- $v_i^D$ and $v_j$ appear in the same block of genetic map $\eta$.

It turns out that both $v_i$ and $v_i^D$ come from the same block of genetic map $\eta$, which contradicts
the fact that $v_i$ and $v_i^D$ shall appear in consecutive blocks. So, this proves that $q \le 2m - 1$ when
$\lambda_1 \ge 4$.

Consider now the list $(\eta(v_{h+1}), \eta(v_{h+2}), \ldots, \eta(v_q))$. Let $m_1$ and $m_2$ denote respectively the
number of unique values and the number of duplicate values in the above list (duplicated values
being counted only once). By Lemma 5.11, we know that no value can appear more than twice in
the list. Then, we obtain the following equation.

$$m_1 + 2 m_2 = q - h. \qquad (1)$$

Let us assume for a moment that $\lambda_h \ge 5$, i.e., $C_h$ has more than four joints. Let $e_1$, $e_2$, $e_3$ and $e_4$
be four joints such that $C_h$ uses the path $e_1 \to_D e_2 \to^{*}_{F} w \to^{*}_{F} e_3 \to_D e_4$. Note that either $e_2 = v_h$
or $e_3 = v_h$. And, for all $h' > h$, the vertex $v_{h'}$ appears between joints $e_2$ and $e_3$, so we may write
$C_h = e_1 \to_D e_2 \to^{*}_{F} v_{h'} \to^{*}_{F} e_3 \to_D e_4 \to^{+} e_1$. Consider a joint $e_i$ rather than $e_1$, $e_2$, $e_3$ and $e_4$,
for all $i \in [5, \lambda_h]$. We have either
$C_h = e_1 \to_D e_2 \to^{*}_{F} v_{h'} \to^{*}_{F} e_3 \to_D e_4 \to^{+} e_i \to_D e_i^D \to^{+} e_1$ or
$C_h = e_1 \to_D e_2 \to^{*}_{F} v_{h'} \to^{*}_{F} e_3 \to_D e_4 \to^{+} e_i^D \to_D e_i \to^{+} e_1$. In either case, using Lemma 5.4 with
three vertices $v_{h'}$, $e_i$ and $e_i^D$, we have $\eta(e_i) \notin I(v_{h'})$, for all $i \in [5, \lambda_h]$ and all $h' > h$. In other
words, for each value $\eta$ counted into $m_1$ or $m_2$, we cannot have any joint $e_i$ for $i \in [5, \lambda_h]$ such
that $\eta = \eta(e_i)$.

Consider now the list $(\eta(e_1), \eta(e_2), \eta(e_3), \eta(e_4))$. Let $m_3$ and $m_4$ denote the number of values
(duplicated values being counted only once) in this list that appear or do not appear in the preceding
list $(\eta(v_{h+1}), \eta(v_{h+2}), \ldots, \eta(v_q))$, respectively. First, note that $e_1$ and $e_3$ are two non-consecutive
joints of $C_h$. By Lemma 5.5, we cannot have $\eta(e_1) = \eta(e_3)$, which implies that

$$m_3 + m_4 \ge 2. \qquad (2)$$

Then, consider each value $\eta$ counted into $m_2$. By definition of $m_2$, we have two distinct vertices $v_i$
and $v_j$ such that $\eta = \eta(v_i) = \eta(v_j)$ and $h < i < j \le q$. By using the same arguments as in
the preceding paragraph, we can show that this value $\eta$ will not be counted into $m_3$. It hence follows
that

$$m_3 \le m_1. \qquad (3)$$

In addition, for each value $\eta$ counted into $m_4$, by Lemma 5.5 we cannot have two distinct joints $e_i$
and $e_j$ for $i, j \in [5, \lambda_h]$ such that $\eta = \eta(e_i) = \eta(e_j)$.
To summarize, for each value $\eta$ counted into $m_1$ or $m_2$, there is no joint $e_i$ for $i \in [5, \lambda_h]$ such
that $\eta = \eta(e_i)$. For each value $\eta$ counted into $m_4$, there exists at most one joint $e_i$ for $i \in [5, \lambda_h]$
such that $\eta = \eta(e_i)$. For any other possible value $\eta$, there exist at most two joints $e_i$ and $e_j$ for
$i, j \in [5, \lambda_h]$ such that $\eta = \eta(e_i) = \eta(e_j)$; moreover, the total of such possible $\eta$ values (i.e., all the
$\eta$ values attained by the joints other than $e_1$, $e_2$, $e_3$ and $e_4$) is no more than $m - m_1 - m_2 - m_4$.
Based on these observations, we can deduce the following inequality:

$$\lambda_h - 4 \le 2(m - m_1 - m_2 - m_4) + m_4. \qquad (4)$$

Note that $\lambda_h$ is always even. Then, by using the above Equality (1) and Inequalities (2), (3) and (4), we
obtain the following inequality for $\lambda_h \ge 5$:

$$\frac{\lambda_h}{2} \le m - \left\lceil \frac{q - h}{2} \right\rceil + 1. \qquad (5)$$

This inequality also holds when $\lambda_h = 2$ because $q \le 2m$ and $h \ge 1$. When $\lambda_h = 4$, it does not
hold only when $q = 2m$ and $h = 1$. However, this condition will never be met because we have
shown above that $q \le 2m - 1$ when $\lambda_1 = 4$. To summarize, the above inequality holds for all
$\lambda_h \ge 2$, although it is initially derived based on the assumption that $\lambda_h \ge 5$. Further note that the
above inequality holds for all $m \ge 1$.
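The algebra leading from (1)-(4) to (5) is not spelled out; one way to fill it in, using only the quantities $m_1, \ldots, m_4$, $q$, $h$ and $\lambda_h$ defined above, is sketched here. This is a reconstruction of the omitted step rather than the authors' own derivation.

$$
\begin{aligned}
\lambda_h &\le 2m + 4 - 2m_1 - 2m_2 - m_4 && \text{by (4)} \\
&\le 2m + 4 - 2m_1 - 2m_2 - (2 - m_1) && \text{since } m_4 \ge 2 - m_3 \ge 2 - m_1 \text{ by (2) and (3)} \\
&= 2m + 2 - (m_1 + 2m_2) \\
&= 2m + 2 - (q - h) && \text{by (1).}
\end{aligned}
$$

Dividing by 2 gives $\lambda_h / 2 \le m + 1 - (q - h)/2$. Because $\lambda_h$ is even, the left-hand side is an integer, so the bound can be tightened to $\lambda_h / 2 \le m + 1 - \lceil (q - h)/2 \rceil$, which is Inequality (5).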

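Let us assume for a moment that $m \ge 2$. By Lemma 5.5, we have that $\lambda_h \le 2m$ when $m \ge 2$.
Thus, $\frac{\lambda_h}{2} \le \min\left(m,\; m - \left\lceil \frac{q-h}{2} \right\rceil + 1\right)$ holds for all the conflict-cycles being considered during the
execution of Approx-MBVS, regardless of their types.

Recall that, for a possible minimal type I conflict-cycle $C_q$, the algorithm will select two joints
rather than one joint (as computed by $\frac{\lambda_q}{2}$). By incorporating this, we then obtain (assume that
$m \ge 2$):

$$
\begin{aligned}
J_w &\le \max\left\{2, \frac{\lambda_q}{2}\right\} + \sum_{h=1}^{q-1} \frac{\lambda_h}{2} \\
&\le m + \sum_{h=1}^{q-1} \min\left(m,\; m - \left\lceil \frac{q-h}{2} \right\rceil + 1\right) \\
&= m + \sum_{h=1}^{q-1} \left(m - \left\lceil \frac{h}{2} \right\rceil + 1\right) \\
&\le m + \sum_{h=1}^{2m-1} \left(m - \left\lceil \frac{h}{2} \right\rceil + 1\right) \\
&= m + \sum_{h=1}^{2m-1} (m + 1) - \sum_{h=1}^{2m-1} \left\lceil \frac{h}{2} \right\rceil \\
&= 2m^2 + 2m - 1 - \left(m + 2 \sum_{h=1}^{m-1} h\right) \\
&= m^2 + 2m - 1.
\end{aligned}
$$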

In case of $m = 1$, by Lemma 5.10 we have $q = 1$ (we assume here that at least one conflict-cycle
is considered going via $w$; otherwise, $J_w = 0$). No matter whether this cycle $C_1$ is of type
I, of type II with $|W(C_1)| = 1$, or of type II with $|W(C_1)| \ge 2$, the algorithm will select exactly
two joints only, thereby making $J_w \le m^2 + 2m - 1$ still true. In conclusion, $J_w \le m^2 + 2m - 1$
holds for all $m \ge 1$. ■
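The closed form at the end of the summation can be checked numerically. The short Python sketch below (an illustration added here, with a hypothetical function name) evaluates $m + \sum_{h=1}^{2m-1} (m - \lceil h/2 \rceil + 1)$ using integer arithmetic, where $\lceil h/2 \rceil$ is computed as `(h + 1) // 2` for positive integers $h$.

```python
def joint_bound(m):
    """Evaluate m + sum_{h=1}^{2m-1} (m - ceil(h/2) + 1) with integers only."""
    return m + sum(m - (h + 1) // 2 + 1 for h in range(1, 2 * m))


# The sum agrees with the closed form m^2 + 2m - 1 on a range of values.
for m in range(1, 200):
    assert joint_bound(m) == m * m + 2 * m - 1
```

The identity rests on $\sum_{h=1}^{2m-1} \lceil h/2 \rceil = m^2$, since every value from 1 to $m - 1$ occurs twice and $m$ occurs once.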

Corollary 5.14 Let $w \in W$ and $\mathcal{C}$ the set of all the conflict-cycles being considered during the
execution of Approx-MBVS going via $w$. Then, the total number of vertices in $W$ to be removed
from cycles of $\mathcal{C}$ is bounded from above by $m^2 + 2m - 1$.

Theorem 5.15 Algorithm Approx-MBVS achieves an $(m^2 + 2m - 1)$-approximation for the
MBVS problem, where $m$ is the number of genetic maps used to create the input adjacency-order
graph.

Proof. Correctness of Algorithm Approx-MBVS follows from Corollary 5.14 since the
algorithm removes at least one vertex from each conflict-cycle. Let $W^o = \{w^o_1, \ldots, w^o_k\}$ be an
optimal solution of size $k$, i.e., a minimum breakpoint vertex set of $G_\Pi$. For each $w^o_i$, the algorithm
deletes at most $(m^2 + 2m - 1)$ adjacencies of $W$ (by Corollary 5.14). Since every cycle being
considered by the algorithm goes through some $w^o_i$, the total size of the output solution is at most
$k \cdot (m^2 + 2m - 1)$. The next subsection shows that the algorithm can be executed in polynomial
time. ■


5.2 Running time

The remaining question in the algorithm Approx-MBVS is whether there exists any polynomial-time
algorithm to find a minimal conflict-cycle from an induced subgraph $G_\Pi[W' \cup \Sigma]$. Since the
algorithm considers all the type II conflict-cycles before any type I conflict-cycle is considered, we
first present the algorithm to find a minimal conflict-cycle of type II.

5.2.1 Finding a minimal type II conflict-cycle 

First of all, we can develop a procedure to determine whether a given cycle is a conflict-cycle
(following the definition) and, if it is, further determine whether it is of type I or of type II (following
Lemma 3.6). We denote this procedure by CC_II-check(), and note that it can be executed in $O(n)$
time.

Lemma 5.16 Let $W'$ be a subset of $W$. If $G_\Pi[W' \cup \Sigma]$ contains a type II conflict-cycle, then it
also contains a type II conflict-cycle $C = a \xrightarrow{P}{}^{+} c \xrightarrow{Q}{}^{+} b \to^{*}_{F} a$ such that (i) $a, b, c \in \Sigma$, (ii) neither
$a \le c \le b$ nor $b \le c \le a$, and (iii) $P$ and $Q$ are the respective shortest paths between two vertices
in the induced subgraph $G_\Pi[W'' \cup \Sigma]$ where $W'' = W' - \{(a-1) \cdot a,\; a \cdot (a+1),\; (b-1) \cdot b,\; b \cdot (b+1)\}$.

Proof. Since $G_\Pi[W' \cup \Sigma]$ contains a conflict-cycle of type II, by Lemma 3.9, it also contains a
simple conflict-cycle of type II. Let this simple conflict-cycle be $C'$, with $W_1(C') = \{i \cdot (i+1) \mid a_1 \le
i < b_1\}$. By Lemma 3.6, there exists a vertex $c \in \Sigma(C')$ such that $c \notin [a_1, b_1]$. So, we have either
$C' = a_1 \to^{+} c \to^{+} b_1 \to^{*}_{F} a_1$ or $C' = a_1 \to^{*}_{F} b_1 \to^{+} c \to^{+} a_1$. In the first case, we let $a = a_1$ and
$b = b_1$; in the second case, let $a = b_1$ and $b = a_1$. In both cases, $C'$ uses the path $R = a \to^{+} c \to^{+} b$.

Recall that $C'$ is simple, so $R$ will not traverse any vertices from the set $\{(a-1) \cdot a,\; a \cdot (a+1),\;
(b-1) \cdot b,\; b \cdot (b+1)\}$. It turns out that the path $R$ is fully contained in the induced subgraph
$G_\Pi[W'' \cup \Sigma]$ where $W'' = W' - \{(a-1) \cdot a,\; a \cdot (a+1),\; (b-1) \cdot b,\; b \cdot (b+1)\}$. Since there exists in
$G_\Pi[W'' \cup \Sigma]$ a path from $a$ to $c$ and also a path from $c$ to $b$, we may write their respective shortest
paths $a \xrightarrow{P}{}^{+} c$ and $c \xrightarrow{Q}{}^{+} b$. Thus, we obtain a new cycle $C = a \xrightarrow{P}{}^{+} c \xrightarrow{Q}{}^{+} b \to^{*}_{F} a$. Note that the path
$a \to^{+} c \to^{+} b$ could not traverse any vertex from the set $\{(a-1) \cdot a,\; a \cdot (a+1),\; (b-1) \cdot b,\; b \cdot (b+1)\}$,
so that $\{i \cdot (i+1) \mid a_1 \le i < b_1\}$ is also an element of $W(C)$ and, moreover, $c \notin [a_1, b_1]$. It hence
follows from Lemma 3.6 that $C$ is a conflict-cycle of type II. ■

Based on the above lemma, we propose a procedure to determine whether a given graph
$G_\Pi[W' \cup \Sigma]$ contains a type II conflict-cycle and, if any, to report one. It is done by conducting
four tests for all triples of distinct vertices $(a, b, c) \in \Sigma \times \Sigma \times \Sigma$: (i) whether $c \notin [a, b]$
if $a < b$ and $c \notin [b, a]$ if $b < a$ (taking $O(n)$ time), (ii) whether all the vertices of
$\{i \cdot (i+1) \mid a \le i < b \text{ or } b \le i < a\}$ exist in $G_\Pi[W' \cup \Sigma]$ (taking $O(n)$ time), (iii) whether there exists a
shortest path $a \xrightarrow{P}{}^{+} c$ between $a$ and $c$ in $G_\Pi[W'' \cup \Sigma]$ (taking $O(n^2)$ time), and (iv) whether there
exists a shortest path $c \xrightarrow{Q}{}^{+} b$ between $c$ and $b$ in $G_\Pi[W'' \cup \Sigma]$ (taking $O(n^2)$ time). If a triple $(a, b, c)$
passes all the four tests, then we find a type II conflict-cycle $C = a \xrightarrow{P}{}^{+} c \xrightarrow{Q}{}^{+} b \to^{*}_{F} a$. If, instead, no
triples in $\Sigma \times \Sigma \times \Sigma$ can pass them, then we know that $G_\Pi[W' \cup \Sigma]$ contains no conflict-cycles of
type II. We denote this procedure by CC_II-seed(), and note that it can be executed in time $O(n^5)$.
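The four tests of the seeding procedure can be sketched as a triple loop with breadth-first searches. The Python code below is a hedged illustration only: the graph encoding (genes as integers, the adjacency $i \cdot (i+1)$ as the tuple `('w', i)`, arcs as a successor dictionary `succ`), the function names, and the toy graph in the usage note are assumptions made for this example and do not come from the paper.

```python
from collections import deque
from itertools import permutations


def bfs_path(succ, src, dst, banned):
    """Shortest path from src to dst avoiding `banned` vertices, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if x == dst:
            path = []
            while x is not None:
                path.append(x)
                x = prev[x]
            return path[::-1]
        for y in succ.get(x, ()):
            if y not in prev and y not in banned:
                prev[y] = x
                queue.append(y)
    return None


def cc2_seed(genes, succ, present_adj):
    """Return (a, b, c, P, Q) for the first triple passing tests (i)-(iv).

    present_adj: set of indices i whose adjacency ('w', i) is still present.
    """
    for a, b, c in permutations(genes, 3):
        lo, hi = min(a, b), max(a, b)
        if lo <= c <= hi:                                    # test (i)
            continue
        if any(i not in present_adj for i in range(lo, hi)): # test (ii)
            continue
        # Adjacencies excluded from W'' for this pair (a, b).
        banned = {('w', a - 1), ('w', a), ('w', b - 1), ('w', b)}
        p = bfs_path(succ, a, c, banned)                     # test (iii)
        if p is None:
            continue
        q = bfs_path(succ, c, b, banned)                     # test (iv)
        if q is not None:
            return a, b, c, p, q
    return None
```

On a toy graph with genes 1 to 4, adjacency vertices linked to their two genes in both directions, and two extra arcs 2 → 4 and 4 → 1, the first triple that passes all four tests is $(a, b, c) = (2, 1, 3)$. As the text notes for CC_II-seed(), the resulting cycle need not be simple.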
It is worth noting that the conflict-cycle $C$ found by the above procedure CC_II-seed() is
not necessarily simple. If $C$ is not simple, by Lemma 3.9, we know that there must exist a simple
type II conflict-subcycle of $C$. To find it, we propose a procedure, called CC_II-simplify(), which
works by mainly applying CC_II-check() to every simple subcycle of $C$. Note that the procedure
CC//-simplify() can also be executed in 0(n ) time. 

By applying the procedures CC_II-seed() and CC_II-simplify() successively, we may obtain a
simple type II conflict-cycle (if any) from $G_\Pi[W' \cup \Sigma]$. The next lemma then tells us how to find
a minimal conflict-cycle of type II.

Lemma 5.17 Let $C$ be a simple conflict-cycle of type II. If it has a shortcut, then it also has a
shortcut $R = u \xrightarrow{R_1}{}^{*}_{D} w \xrightarrow{R_2}{}^{*}_{D} v$ such that (i) $u, v \in \Sigma(C)$, (ii) $w \in \Sigma$, and (iii) $R_1$ and $R_2$ are the
respective shortest paths between two vertices in $(\Sigma, D)$.

Proof. Since $C$ has a shortcut, let this shortcut be the path $u \xrightarrow{R'}{}^{*}_{D} v$ (note that $u \neq v$ because $C$
is simple). By definition, we know that (i) $u, v \in \Sigma(C)$, so we may write $C = v \xrightarrow{P}{}^{+} u \xrightarrow{Q}{}^{+} v$, (ii)
the cycle $C' = v \xrightarrow{P}{}^{+} u \xrightarrow{R'}{}^{*}_{D} v$ is also a conflict-cycle of type II, and (iii) $W(Q) \neq \emptyset$.

Let $W_1(C') = \{i \cdot (i+1) \mid a_1 \le i < b_1\}$. Since $C'$ is a conflict-cycle of type II, by Lemma 3.6
there exists a vertex $w' \in \Sigma(C')$ such that $w' \notin [a_1, b_1]$. If $w'$ is located on the path $P$, then let
$w = u$; otherwise, $w'$ is located on the path $R'$, and we instead let $w = w'$. We can see that, in
both cases, there exists in $(\Sigma, D)$ at least one path from $u$ to $w$ and also at least one path from $w$ to
$v$. Let $u \xrightarrow{R_1}{}^{*}_{D} w$ and $w \xrightarrow{R_2}{}^{*}_{D} v$ denote their respective shortest paths, so we may write the path
$R = u \xrightarrow{R_1}{}^{*}_{D} w \xrightarrow{R_2}{}^{*}_{D} v$. Thus, we obtain a new cycle $C'' = v \xrightarrow{P}{}^{+} u \xrightarrow{R}{}^{*}_{D} v$. To show that $R$ is a
shortcut of $C$, it suffices to show that the cycle $C''$ is a conflict-cycle of type II, as done below.

Note that $W(R) = W(R')$, since neither $R$ nor $R'$ uses any vertex from $W$. Consequently,
$W(C') = W(C'')$, which implies that $\{i \cdot (i+1) \mid a_1 \le i < b_1\}$ is also an element of $W(C'')$. Further
note that, no matter in which case the vertex $w$ is defined, the vertex $w'$ is always in $\Sigma(C'')$ so that
$w' \notin [a_1, b_1]$. Thus, it follows from Lemma 3.6 that $C''$ is a conflict-cycle of type II. ■

Based on the above lemma, we propose a procedure$^{2}$ to determine whether a given simple type II conflict-cycle $C$ is minimal and, if it is not minimal, to report a type II conflict-cycle $C'$ with $W(C') \subset W(C)$. It is done by conducting four tests for all triples of vertices $(u, v, w) \in \Sigma(C) \times \Sigma(C) \times \Sigma$: (i) whether $W(Q) \ne \emptyset$ where $C = v \xrightarrow{P}{}^{+} u \xrightarrow{Q}{}^{+} v$ (taking $O(n)$ time), (ii) whether there exists a shortest path $u \to^{*}_{D} w$ between $u$ and $w$ in $(\Sigma, D)$ (taking $O(n^2)$ time), (iii) whether there exists a shortest path $w \to^{*}_{D} v$ between $w$ and $v$ in $(\Sigma, D)$ (taking $O(n^2)$ time), and (iv) whether the cycle $C' = v \xrightarrow{P}{}^{+} u \to^{*}_{D} w \to^{*}_{D} v$ is a conflict-cycle of type II by using the procedure CC$_{II}$-check() (taking $O(n^2)$ time). If a triple $(u, v, w)$ passes all four tests, then we find a type II conflict-cycle $C'$ such that $W(C') \subset W(C)$ (i.e., the path $u \to^{*}_{D} w \to^{*}_{D} v$ is a shortcut of $C$). If, instead, no triple in $\Sigma(C) \times \Sigma(C) \times \Sigma$ can pass them, then we know that $C$ is

2 The main challenge in developing such a procedure is to ensure that it would not end up with a conflict-cycle of 
type I. 
Algorithm Find-a-Minimal-Type-II-Conflict-Cycle
input   An induced adjacency-order subgraph $G_{\Pi}[W' \cup \Sigma]$
output  A minimal type II conflict-cycle $C$

begin
    $C \leftarrow$ CC$_{II}$-seed();
    $C' \leftarrow C$;
    while $C' \ne \emptyset$
        $C \leftarrow C'$;
        $C \leftarrow$ CC$_{II}$-simplify($C$);
        $C' \leftarrow$ CC$_{II}$-reduce($C$);
    return $C$;
end

Table 3: A polynomial-time algorithm for finding a minimal type II conflict-cycle from an induced
adjacency-order subgraph $G_{\Pi}[W' \cup \Sigma]$. Note that $G_{\Pi}[W' \cup \Sigma] = G_{\Pi}$ if $W' = W$.

already minimal. We denote this procedure by CC$_{II}$-reduce(), and note that it can be executed in
time $O(n^5)$.
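As a rough illustration of the four tests, the following Python sketch enumerates the triples $(u, v, w)$. It is not the authors' implementation. The cycle representation (a list of vertices in cyclic order), the helper names `shortest_path`, `segment` and `reduce_cycle`, and the callable `is_type2`, which merely stands in for CC$_{II}$-check(), are all assumptions made for this example.

```python
from collections import deque
from itertools import product


def shortest_path(adj, src, dst):
    """BFS shortest path src ->* dst in a digraph {vertex: successors}.

    Returns the vertex list ([src] when src == dst) or None if unreachable.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if x == dst:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        for y in adj.get(x, ()):
            if y not in parent:
                parent[y] = x
                queue.append(y)
    return None


def segment(cycle, s, t):
    """Vertices met when walking along a simple cycle from s to t (inclusive)."""
    i = cycle.index(s)
    out = [s]
    while out[-1] != t:
        i = (i + 1) % len(cycle)
        out.append(cycle[i])
    return out


def reduce_cycle(cycle, genes, d_adj, is_gene, is_type2):
    """Return a cycle built from a shortcut of `cycle`, or None if none passes.

    `is_type2` is a hypothetical stand-in for the type II check of the paper.
    """
    on_cycle = [x for x in cycle if is_gene(x)]
    for u, v in product(on_cycle, repeat=2):
        if u == v:
            continue
        q = segment(cycle, u, v)            # part Q that a shortcut would replace
        if all(is_gene(x) for x in q):      # test (i): W(Q) must be non-empty
            continue
        p = segment(cycle, v, u)            # part P that is kept
        for w in genes:
            r1 = shortest_path(d_adj, u, w)  # test (ii)
            r2 = shortest_path(d_adj, w, v)  # test (iii)
            if r1 is None or r2 is None:
                continue
            walk = p + r1[1:] + r2[1:]       # v ->+ u ->* w ->* v, ending at v
            candidate = walk[:-1]            # drop the repeated start vertex v
            if is_type2(candidate):          # test (iv)
                return candidate
    return None
```

The returned cycle need not be simple, which is why Table 3 applies CC$_{II}$-simplify() before each reduction step.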

We present in Table 3 the algorithm used to find a minimal type II conflict-cycle from an
adjacency-order (sub)graph. Note that $W(C') \subset W(C)$ holds after each execution of the while
loop, so that the while loop cannot be repeated more than $n$ times. Thus, we can see that this
algorithm can be executed in $O(n^6)$ time.
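The control flow of Table 3 can be sketched in Python as below. This is only a sketch under stated assumptions: the three procedures are passed in as callables (`seed`, `simplify`, `reduce_step`, all hypothetical names), and a failed reduction is signalled by `None` or an empty value rather than by $\emptyset$.

```python
def find_minimal_type2_cycle(seed, simplify, reduce_step):
    """Shrink a seed cycle until no further reduction is reported.

    Returns (cycle, rounds), or (None, 0) when the seed procedure finds nothing.
    """
    candidate = seed()
    if not candidate:
        return None, 0
    cycle = candidate
    rounds = 0
    while candidate:                      # while C' is not empty
        cycle = simplify(candidate)       # C <- simplify(C')
        candidate = reduce_step(cycle)    # C' <- reduce(C)
        rounds += 1
    return cycle, rounds
```

Because every successful reduction strictly shrinks $W(C)$, the counter `rounds` is bounded by the number of vertices, which is the argument behind the $O(n^6)$ bound.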

5.2.2 Finding a minimal type I conflict-cycle 

The algorithm Approx-MBVS starts the search for a minimal type I conflict-cycle only when
there are no longer any type II conflict-cycles contained in the subgraph $G_{\Pi}[W' \cup \Sigma]$. The following
lemma assists us in developing an algorithm to find a minimal type I conflict-cycle from $G_{\Pi}[W' \cup \Sigma]$.

Lemma 5.18 Let $W'$ be a subset of $W$. If $G_{\Pi}[W' \cup \Sigma]$ contains a type I conflict-cycle, then it also
contains a type I conflict-cycle $C = a_1 \xrightarrow{e_1} b_1 \to^{*}_{F} a_2 \xrightarrow{e_2} b_2 \to^{*}_{F} a_1$ such that (i) the arcs $e_1 \in X$
and $e_2 \in Y$, (ii) $V(C) = \{i \cdot (i+1) \mid a \le i < b\} \cup \{i \mid a \le i \le b\}$ where $a = \min\{a_1, b_1, a_2, b_2\}$
and $b = \max\{a_1, b_1, a_2, b_2\}$, and (iii) $D(C) = \{e_1, e_2\}$.

Proof. Since $G_{\Pi}[W' \cup \Sigma]$ contains a type I conflict-cycle, by definition, it shall use one arc
$e_1 = a_1 \to b_1 \in X$, one arc $e_2 = a_2 \to b_2 \in Y$, and all the vertices of $\{i \cdot (i+1) \mid a \le i < b\} \cup \{i \mid a \le i \le b\}$ if we let $a = \min\{a_1, b_1, a_2, b_2\}$ and $b = \max\{a_1, b_1, a_2, b_2\}$. With these
arcs and vertices, we are able to construct a desired type I conflict-cycle $C$ through a case study, as
illustrated in Figure 2. ■

Based on the above lemma, we propose the following algorithm to find a minimal type I
conflict-cycle (if any). For all pairs of arcs $(e_1, e_2) \in X \times Y$, where $e_1 = a_1 \to b_1 \in X$ and
Figure 2: A conflict-cycle of type I can be formed for each of the six general cases, shown in panels (1) to (6). In every case the cycle consists of the arc $a_1 \to_X b_1$ and the arc $a_2 \to_Y b_2$, joined by paths of $F$-arcs.

$e_2 = a_2 \to b_2 \in Y$, first compute $a = \min\{a_1, b_1, a_2, b_2\}$ and $b = \max\{a_1, b_1, a_2, b_2\}$ and then test
if there exists a path $a \to^{*}_{F} b$ from $a$ to $b$ using arcs all from $F$ (each taking $O(n)$ time). Among all
those pairs that pass the test, the one that attains the smallest value of $(b - a)$ will be returned as
a minimal type I conflict-cycle. Note that this algorithm can be executed in $O(n^5)$ time since the
total number of arc pairs is no more than $O(n^4)$.
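The pair enumeration above can be sketched in Python as follows. The sketch rests on an assumption that is not stated in this section: that the $F$-arcs link gene $i$, the adjacency vertex $i \cdot (i+1)$ and gene $i+1$ consecutively, so that a path $a \to^{*}_{F} b$ exists exactly when every adjacency vertex between $a$ and $b$ is still present. The function name and the encoding of $W'$ as a set of indices $i$ are likewise hypothetical.

```python
def find_minimal_type1(x_arcs, y_arcs, w_prime):
    """Scan all (e1, e2) in X x Y and keep the pair with the smallest b - a.

    x_arcs, y_arcs: iterables of (tail, head) pairs of gene indices.
    w_prime: set of indices i whose adjacency vertex i.(i+1) is still present.
    Returns (b - a, e1, e2) for the best pair, or None if no pair passes.
    """
    best = None
    for (a1, b1) in x_arcs:
        for (a2, b2) in y_arcs:
            a = min(a1, b1, a2, b2)
            b = max(a1, b1, a2, b2)
            # Assumed path test: a ->*_F b exists iff all adjacency
            # vertices i.(i+1) with a <= i < b are present; O(n) per pair.
            if all(i in w_prime for i in range(a, b)):
                if best is None or b - a < best[0]:
                    best = (b - a, (a1, b1), (a2, b2))
    return best
```

With at most $O(n^4)$ pairs and a linear-time test per pair, the sketch stays within the $O(n^5)$ bound quoted above.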

Consider now the whole execution of the algorithm Approx-MBVS. Note that the two while
loops of Approx-MBL cannot each be repeated more than $n$ times because we delete at least one
vertex in F for each minimal conflict-cycle $C$ to be considered. Therefore, the algorithm Approx-MBVS
(and hence Algorithm Approx-MBL) can be executed in $O(n^7)$ time. The main result of
this paper thus follows (the approximation ratio follows from Theorem 5.15).

Theorem 5.19 Algorithm Approx-MBL achieves an approximation ratio of $(m^2 + 2m - 1)$ for
the MBL problem and runs in $O(n^7)$ time, where $m$ is the number of genetic maps used to create
the input partial order and $n$ the total number of distinct genes appearing in these maps.
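The running-time bound can be tallied from the figures quoted in Section 5.2. The tally below is a sketch that assumes one while loop handles type II conflict-cycles and the other handles type I conflict-cycles, each running at most $n$ rounds; the symbol $T(n)$ is introduced here only for the tally.

```latex
T(n) \;\le\; \underbrace{n \cdot O(n^{6})}_{\text{type II rounds}}
      \;+\; \underbrace{n \cdot O(n^{5})}_{\text{type I rounds}}
      \;=\; O(n^{7})
```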


6 Conclusions 

In this paper, we have studied the MBL problem in its original version, i.e., it assumes that gene 
strandedness is not available in the input genetic maps. We found that the approximation algorithm 
proposed in 0] for the MBL problem is not applicable here because it implicitly requires the 
availability of gene strandedness. Therefore, we revised the definition of conflict-cycle in the 
adjacency-order graphs, and then developed an approximation algorithm by basically generalizing 
the algorithm in 01. It achieves a ratio of $(m^2 + 2m - 1)$ and runs in $O(n^7)$ time, where $m$ is the
number of genetic maps used to construct the input partial order and $n$ the total number of distinct
genes in these maps. We believe that the same approximation ratio also applies to the special
variant of the MBL problem studied in [3], thereby achieving an improved approximation ratio
over the previous one $(m^2 + 4m - 4)$ given in Q. In the future, it would be very interesting to investigate
whether an $O(m)$-approximation can be achieved for the MBL problem.